Invention patent application
US20090327877A1 SYSTEM AND METHOD FOR DISAMBIGUATING TEXT LABELING CONTENT OBJECTS (status: under examination, published)

Abstract:
An improved system and method are provided for disambiguating text strings that label content objects. A text string set may be received from a user. Frequencies of co-occurring text strings in a text collection may be obtained, and a disambiguation measure may be determined for a pair of text strings that each co-occur with a text string in the text string set. The disambiguation measure may be based on a weighted KL divergence of text string distributions, so that the divergence is largest when the text string set occurs in different contexts. A disambiguation measure may be determined for each pair in a list of the most common pairs of text strings that co-occur with the text string set. Those pairs whose disambiguation measure exceeds a threshold may be output in decreasing order of disambiguation measure.
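The steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything specific here is an assumption made for illustration: the function names (`context_distribution`, `kl`, `disambiguating_pairs`), the additive smoothing constant `alpha`, the use of a symmetric KL divergence, and the choice of weight (the product of each string's relative co-occurrence frequency with the query set). The text collection is modeled as a list of label sets, one per content object, and candidate pairs are formed from the `top_n` most frequent co-occurring strings.

```python
import math
from collections import Counter
from itertools import combinations


def context_distribution(matching, query, extra, vocab, alpha=0.5):
    """Smoothed distribution over `vocab` of strings that co-occur with
    the query set plus one extra string (alpha is an assumed smoothing)."""
    counts = Counter()
    for labels in matching:
        if extra in labels:
            counts.update(labels - query - {extra})
    total = sum(counts[t] for t in vocab) + alpha * len(vocab)
    return {t: (counts[t] + alpha) / total for t in vocab}


def kl(p, q):
    """KL divergence between two distributions over the same keys."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p)


def disambiguating_pairs(collection, query, top_n=10, threshold=0.0):
    """Rank pairs of co-occurring strings by an assumed weighted,
    symmetric KL divergence; keep only pairs above `threshold`."""
    query = frozenset(query)
    label_sets = [frozenset(labels) for labels in collection]
    matching = [labels for labels in label_sets if query <= labels]
    # Frequencies of strings that co-occur with the query set.
    freq = Counter(t for labels in matching for t in labels - query)
    if not freq:
        return []
    top = [t for t, _ in freq.most_common(top_n)]
    vocab = set(freq)
    results = []
    for a, b in combinations(top, 2):
        rest = vocab - {a, b}
        if not rest:
            continue
        p = context_distribution(matching, query, a, rest)
        q = context_distribution(matching, query, b, rest)
        # Assumed weight: favors pairs in which both strings are common.
        weight = (freq[a] / len(matching)) * (freq[b] / len(matching))
        score = weight * (kl(p, q) + kl(q, p))
        if score > threshold:
            results.append((a, b, score))
    # Output in decreasing order of disambiguation measure.
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical toy collection: "jaguar" labels both cars and cats.
photos = [
    {"jaguar", "car", "road"},
    {"jaguar", "car", "road"},
    {"jaguar", "car", "speed"},
    {"jaguar", "cat", "zoo"},
    {"jaguar", "cat", "zoo"},
    {"jaguar", "cat", "jungle"},
]
ranked = disambiguating_pairs(photos, {"jaguar"}, top_n=4, threshold=0.1)
```

On this toy data the pair of "car" and "cat" ranks first, since the two strings are both frequent and appear with disjoint sets of other labels. A pair such as "car" and "road" scores low because the two strings describe the same context.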