-
Publication number: US20140379743A1
Publication date: 2014-12-25
Application number: US14457869
Application date: 2014-08-12
Applicant: Google Inc.
Inventor: Leonardo A. Laroco, Jr., Nikola Jevtic, Nikolai V. Yakovenko, Jeffrey Reynar
IPC: G06F17/30
CPC classification number: G06F16/93, G06F16/955, G06N5/04, G06N20/00
Abstract: A system and method for disambiguating references to entities in a document. In one embodiment, an iterative process is used to disambiguate references to entities in documents. An initial model is used to identify documents referring to an entity based on features contained in those documents. The occurrence of various features in these documents is measured. From the number of occurrences of features in these documents, a second model is constructed. The second model is used to identify documents referring to the entity based on features contained in the documents. The process can be repeated, iteratively identifying documents referring to the entity and improving subsequent models based on those identifications. Additional features of the entity can be extracted from documents identified as referring to the entity.
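The iterative process described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the model, so everything concrete here is an assumption made for illustration: the model is taken to be a dictionary of feature weights, a document is scored by summing the weights of the distinct features it contains, a fixed threshold decides whether the document refers to the entity, and each new model weights a feature by the fraction of identified documents that contain it. The function names (`score`, `identify`, `build_model`, `disambiguate`) are hypothetical and do not come from the patent.

```python
from collections import Counter


def score(doc_features, model):
    """Sum the model weights of the distinct features present in a document."""
    return sum(model.get(f, 0.0) for f in set(doc_features))


def identify(docs, model, threshold):
    """Return the ids of documents whose score meets the threshold."""
    return {doc_id for doc_id, feats in docs.items()
            if score(feats, model) >= threshold}


def build_model(docs, selected):
    """Build the next model from feature occurrences in the selected documents.

    Assumption: a feature's weight is the fraction of selected documents
    that contain it at least once.
    """
    counts = Counter()
    for doc_id in selected:
        counts.update(set(docs[doc_id]))
    n = len(selected)
    return {f: c / n for f, c in counts.items()} if n else {}


def disambiguate(docs, initial_model, threshold=1.0, max_iters=5):
    """Alternate between identifying documents and rebuilding the model.

    Stops when the set of identified documents no longer changes, when no
    document is identified, or after max_iters rounds.
    """
    model = initial_model
    selected = set()
    for _ in range(max_iters):
        new_selected = identify(docs, model, threshold)
        if not new_selected or new_selected == selected:
            break
        selected = new_selected
        model = build_model(docs, selected)
    return selected, model


if __name__ == "__main__":
    # Toy corpus: four documents that all mention "jordan".
    docs = {
        "d1": ["jordan", "basketball", "bulls"],
        "d2": ["jordan", "bulls", "nba"],
        "d3": ["jordan", "machine", "learning"],
        "d4": ["jordan", "river", "travel"],
    }
    selected, model = disambiguate(docs, {"basketball": 2.0}, threshold=1.5)
    print(sorted(selected))
    print(model)
```

In this toy run the initial model knows only the feature "basketball", so it identifies a single document at first. The second model, built from that document, picks up "bulls" and "jordan", which is enough to identify a second document in the next round. Features such as "nba" that appear in the final model but not in the initial one correspond loosely to the "additional features" mentioned in the abstract.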
-
Publication number: US09760570B2
Publication date: 2017-09-12
Application number: US14300148
Application date: 2014-06-09
Applicant: Google Inc.
Inventor: Leonardo A. Laroco, Jr., Nikola Jevtic, Nikolai V. Yakovenko, Jeffrey Reynar
CPC classification number: G06F17/30011, G06F17/30876, G06N5/04, G06N99/005
Abstract: A system and method for disambiguating references to entities in a document. In one embodiment, an iterative process is used to disambiguate references to entities in documents. An initial model is used to identify documents referring to an entity based on features contained in those documents. The occurrence of various features in these documents is measured. From the number of occurrences of features in these documents, a second model is constructed. The second model is used to identify documents referring to the entity based on features contained in the documents. The process can be repeated, iteratively identifying documents referring to the entity and improving subsequent models based on those identifications. Additional features of the entity can be extracted from documents identified as referring to the entity.
-
Publication number: US20140289177A1
Publication date: 2014-09-25
Application number: US14300148
Application date: 2014-06-09
Applicant: Google Inc.
Inventor: Leonardo A. Laroco, Jr., Nikola Jevtic, Nikolai V. Yakovenko, Jeffrey Reynar
CPC classification number: G06F17/30011, G06F17/30876, G06N5/04, G06N99/005
Abstract: A system and method for disambiguating references to entities in a document. In one embodiment, an iterative process is used to disambiguate references to entities in documents. An initial model is used to identify documents referring to an entity based on features contained in those documents. The occurrence of various features in these documents is measured. From the number of occurrences of features in these documents, a second model is constructed. The second model is used to identify documents referring to the entity based on features contained in the documents. The process can be repeated, iteratively identifying documents referring to the entity and improving subsequent models based on those identifications. Additional features of the entity can be extracted from documents identified as referring to the entity.
-