Granted invention patent
US07685201B2 Person disambiguation using name entity extraction-based clustering
Legal status: in force
- Patent title: Person disambiguation using name entity extraction-based clustering
- Patent title (Chinese): 使用基于名称实体提取的聚类方法消除歧义
- Application number: US11796818
- Filing date: 2007-04-30
- Publication number: US07685201B2
- Publication date: 2010-03-23
- Inventors: Hua-Jun Zeng, Shen Huang, Zheng Chen, Jian Wang
- Applicants: Hua-Jun Zeng, Shen Huang, Zheng Chen, Jian Wang
- Applicant address: US WA Redmond
- Assignee: Microsoft Corporation
- Current assignee: Microsoft Corporation
- Current assignee address: US WA Redmond
- Main classification: G06F7/00
- IPC classification: G06F7/00; G06F17/30
Abstract:
Described is a technology for disambiguating data corresponding to persons that are located from search results, so that different persons having the same name can be clearly distinguished. Name entity extraction locates words (terms) that are within a certain distance of persons' names in the search results. The terms are used in disambiguating search results that correspond to different persons having the same name, such as location information, organization information, career information, and/or partner information. In one example, each person is represented as a vector, and similarity among vectors is calculated based on weighting that corresponds to nearness of the terms to a person, and/or the types of terms. Based on the similarity data, the person vectors that represent the same person are then merged into one cluster, so that each cluster represents (to a high probability) only one distinct person.
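The pipeline in the abstract (terms near a name, weighted person vectors, similarity, merging into clusters) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `TYPE_WEIGHTS` values, the `1 / (1 + distance)` nearness weighting, cosine similarity as the similarity measure, the `0.5` threshold, and single-link merging. The function names `person_vector`, `cosine`, and `cluster` are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical per-type weights; the abstract names the term types but gives no values.
TYPE_WEIGHTS = {"organization": 2.0, "location": 1.5, "career": 1.0, "partner": 1.0}


def person_vector(terms):
    """Build a sparse vector from (term, term_type, distance_to_name) triples."""
    vec = defaultdict(float)
    for term, term_type, distance in terms:
        # Assumed formula: nearer terms and heavier types contribute more.
        vec[term] += TYPE_WEIGHTS.get(term_type, 1.0) / (1.0 + distance)
    return dict(vec)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, threshold=0.5):
    """Merge vectors whose similarity meets the threshold (single-link, union-find)."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the cluster root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = defaultdict(list)
    for i in range(len(vectors)):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Three invented search results for the same name; each triple is
# (term, term type, word distance from the name).
results = [
    [("Microsoft", "organization", 2), ("Redmond", "location", 5)],
    [("Microsoft", "organization", 1), ("researcher", "career", 3)],
    [("Stanford", "organization", 2), ("professor", "career", 4)],
]
vectors = [person_vector(r) for r in results]
clusters = cluster(vectors)
```

With these invented inputs, the first two results share the heavily weighted term "Microsoft" and fall into one cluster, while the third shares no terms with them and stays separate. A single-link merge was chosen only because it is short to write; the abstract does not say which clustering scheme is used.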