Patent application
US20090327877A1 SYSTEM AND METHOD FOR DISAMBIGUATING TEXT LABELING CONTENT OBJECTS
Pending (published)
- Title: SYSTEM AND METHOD FOR DISAMBIGUATING TEXT LABELING CONTENT OBJECTS
- Application number: US12164039
- Filing date: 2008-06-28
- Publication number: US20090327877A1
- Publication date: 2009-12-31
- Inventors: Malcolm Slaney, Kilian Quirin Weinberger, Roelof van Zwol
- Applicants: Malcolm Slaney, Kilian Quirin Weinberger, Roelof van Zwol
- Applicant address: Sunnyvale, CA, US
- Assignee: Yahoo! Inc.
- Current assignee: Yahoo! Inc.
- Current assignee address: Sunnyvale, CA, US
- Main classification: G06F17/27
- IPC classification: G06F17/27
Abstract:
An improved system and method for disambiguating text strings labeling content objects is provided. A text string set may be received from a user. Frequencies of co-occurring text strings in a text collection may be obtained, and a disambiguation measure may be determined for a pair of text strings that each co-occur with a text string in the text string set. The disambiguation measure may be based on a weighted KL divergence of text string distributions that maximizes the value of divergence when a text string set may occur in different contexts. A disambiguation measure may be determined for a list of the top most common pairs of text strings that co-occur with the text string set, and the pairs of text strings may be output in decreasing order by disambiguation measure for those pairs of text strings with a disambiguation measure that exceeds a threshold.
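The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not give the exact formula, so the symmetric KL divergence, the additive smoothing constant `alpha`, and the pair weight (the product of each candidate's conditional frequency given the input set) are all assumptions made for illustration. The function names, `top_n`, and `threshold` are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def cooccurrence_counts(collection, query, extra=None):
    """Count strings that co-occur with every string in `query` (and `extra`, if given)."""
    required = set(query) | ({extra} if extra is not None else set())
    counts = Counter()
    for labels in collection:
        labels = set(labels)
        if required <= labels:
            counts.update(labels - required)
    return counts


def smoothed_dist(counts, vocab, alpha=0.5):
    """Turn counts into a probability distribution over `vocab` (additive smoothing, assumed)."""
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl(p, q):
    """KL divergence D(p || q) for two distributions over the same keys."""
    return sum(p[w] * log(p[w] / q[w]) for w in p)


def disambiguation_pairs(collection, query, top_n=5, threshold=0.1):
    """Return (a, b, score) for candidate pairs whose score exceeds `threshold`, highest first."""
    base = cooccurrence_counts(collection, query)
    n_query = sum(1 for labels in collection if set(query) <= set(labels))
    candidates = [w for w, _ in base.most_common(top_n)]
    results = []
    for a, b in combinations(candidates, 2):
        ca = cooccurrence_counts(collection, query, a)
        cb = cooccurrence_counts(collection, query, b)
        vocab = sorted((set(ca) | set(cb)) - {a, b})
        if not vocab:
            continue
        pa, pb = smoothed_dist(ca, vocab), smoothed_dist(cb, vocab)
        # Assumed weighting: favor pairs where both candidates are common with the query.
        weight = (base[a] / n_query) * (base[b] / n_query)
        score = weight * (kl(pa, pb) + kl(pb, pa))
        if score > threshold:
            results.append((score, a, b))
    results.sort(reverse=True)
    return [(a, b, round(s, 3)) for s, a, b in results]


# Toy collection (invented for illustration): "jaguar" labels both cars and cats.
collection = [
    ["jaguar", "car", "road", "speed"],
    ["jaguar", "car", "road"],
    ["jaguar", "car", "speed"],
    ["jaguar", "cat", "jungle", "zoo"],
    ["jaguar", "cat", "jungle"],
    ["jaguar", "cat", "zoo"],
]
pairs = disambiguation_pairs(collection, ["jaguar"], top_n=2)
print(pairs)
```

In this toy data the strings that accompany "car" and "cat" barely overlap, so the pair scores well above the threshold and would be offered as the disambiguating choice. A pair whose contexts look alike would score near zero and be filtered out.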