FINDING AND DISAMBIGUATING REFERENCES TO ENTITIES ON WEB PAGES

    Publication number: US20140379743A1

    Publication date: 2014-12-25

    Application number: US14457869

    Filing date: 2014-08-12

    Applicant: Google Inc.

    CPC classification number: G06F16/93 G06F16/955 G06N5/04 G06N20/00

    Abstract: A system and method for disambiguating references to entities in a document. In one embodiment, an iterative process is used to disambiguate references to entities in documents. An initial model is used to identify documents referring to an entity based on features contained in those documents. The occurrence of various features in these documents is measured. From the number of occurrences of features in these documents, a second model is constructed. The second model is used to identify documents referring to the entity based on features contained in the documents. The process can be repeated, iteratively identifying documents referring to the entity and improving subsequent models based on those identifications. Additional features of the entity can be extracted from documents identified as referring to the entity.
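The iterative process described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the method claimed in the patent: the abstract does not say what form the model takes, so the weighted-feature model, the normalized score, the `threshold` parameter, the stopping rule, and all function names and sample documents below are assumptions made for the example.

```python
from collections import Counter

def score(doc_features, model):
    """Share of the model's total weight carried by features present in the document."""
    total = sum(model.values())
    if total == 0:
        return 0.0
    return sum(w for f, w in model.items() if f in doc_features) / total

def identify(docs, model, threshold):
    """Ids of documents the model judges to refer to the entity (assumed rule: score >= threshold)."""
    return {doc_id for doc_id, feats in docs.items() if score(feats, model) >= threshold}

def rebuild_model(docs, matched_ids):
    """Measure feature occurrences in the matched documents and turn them into weights.

    Assumed weighting: fraction of matched documents that contain the feature.
    """
    counts = Counter()
    for doc_id in matched_ids:
        counts.update(docs[doc_id])
    n = len(matched_ids)
    return {f: c / n for f, c in counts.items()} if n else {}

def disambiguate(docs, initial_model, threshold=0.6, max_iters=5):
    """Alternate between identifying documents and rebuilding the model."""
    model = dict(initial_model)
    matched = set()
    for _ in range(max_iters):
        new_matched = identify(docs, model, threshold)
        if new_matched == matched:  # no change: stop iterating
            break
        matched = new_matched
        model = rebuild_model(docs, matched)
    return matched, model

# Hypothetical documents, each reduced to a set of features.
docs = {
    "a": {"michael jackson", "thriller", "singer"},
    "b": {"michael jackson", "singer", "moonwalk"},
    "c": {"michael jackson", "beer", "whisky"},
    "d": {"moonwalk", "singer", "pop"},
}
initial_model = {"michael jackson": 0.5, "thriller": 0.5}

matched, model = disambiguate(docs, initial_model)
print(sorted(matched))
print(sorted(set(model) - set(initial_model)))
```

In this toy run the initial model matches only document "a"; the rebuilt model then also picks up "b" through the shared "singer" feature, while "c" (same name, unrelated features) stays out. Features in the final model that were absent from the initial one loosely correspond to the "additional features of the entity" mentioned in the abstract.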

    Finding and disambiguating references to entities on web pages

    Publication number: US09760570B2

    Publication date: 2017-09-12

    Application number: US14300148

    Filing date: 2014-06-09

    Applicant: GOOGLE INC

    CPC classification number: G06F17/30011 G06F17/30876 G06N5/04 G06N99/005

    Abstract: A system and method for disambiguating references to entities in a document. In one embodiment, an iterative process is used to disambiguate references to entities in documents. An initial model is used to identify documents referring to an entity based on features contained in those documents. The occurrence of various features in these documents is measured. From the number of occurrences of features in these documents, a second model is constructed. The second model is used to identify documents referring to the entity based on features contained in the documents. The process can be repeated, iteratively identifying documents referring to the entity and improving subsequent models based on those identifications. Additional features of the entity can be extracted from documents identified as referring to the entity.

    FINDING AND DISAMBIGUATING REFERENCES TO ENTITIES ON WEB PAGES

    Document type: Invention application

    Legal status: In force

    Publication number: US20140289177A1

    Publication date: 2014-09-25

    Application number: US14300148

    Filing date: 2014-06-09

    Applicant: GOOGLE INC

    CPC classification number: G06F17/30011 G06F17/30876 G06N5/04 G06N99/005

    Abstract: A system and method for disambiguating references to entities in a document. In one embodiment, an iterative process is used to disambiguate references to entities in documents. An initial model is used to identify documents referring to an entity based on features contained in those documents. The occurrence of various features in these documents is measured. From the number of occurrences of features in these documents, a second model is constructed. The second model is used to identify documents referring to the entity based on features contained in the documents. The process can be repeated, iteratively identifying documents referring to the entity and improving subsequent models based on those identifications. Additional features of the entity can be extracted from documents identified as referring to the entity.

