Entity normalization via name normalization

    Publication number: US09710549B2

    Publication date: 2017-07-18

    Application number: US14229774

    Application date: 2014-03-28

    Applicant: Google Inc.

    Inventor: Jonathan T. Betz

    Abstract: Systems and methods for normalizing entities via name normalization are disclosed. In some implementations, a computer-implemented method of identifying duplicate objects in a plurality of objects is provided. Each object in the plurality of objects is associated with one or more facts, and each of the one or more facts having a value. The method includes: using a computer processor to perform: associating facts extracted from web documents with a plurality of objects; and for each of the plurality of objects, normalizing the value of a name fact, the name fact being among one or more facts associated with the object; processing the plurality of objects in accordance with the normalized value of the name facts of the plurality of objects. In some implementations, normalizing the value of the name fact is optionally carried out by applying a group of normalization rules to the value of the name fact.
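    Illustration: the abstract describes normalizing the value of each object's name fact and then processing objects by the normalized value to find duplicates. The following is a minimal sketch of that idea, not the patented method. The rule set (lowercasing, stripping punctuation, dropping articles), the dictionary layout of objects and facts, and the function names are all assumptions made for illustration; the abstract only says "a group of normalization rules".

```python
import re
from collections import defaultdict


def normalize_name(value: str) -> str:
    """Apply a small, hypothetical group of normalization rules to a name value."""
    value = value.lower()                            # rule 1: case-fold
    value = re.sub(r"[^\w\s]", "", value)            # rule 2: strip punctuation
    value = re.sub(r"\b(the|a|an)\b", " ", value)    # rule 3: drop English articles
    return " ".join(value.split())                   # rule 4: collapse whitespace


def group_by_normalized_name(objects):
    """Return {normalized name: [object ids]} for names shared by 2+ objects."""
    groups = defaultdict(list)
    for obj in objects:
        name = obj["facts"].get("name")              # the object's name fact, if any
        if name is not None:
            groups[normalize_name(name)].append(obj["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Assumed toy data: each object carries facts as attribute -> value.
objects = [
    {"id": "o1", "facts": {"name": "The Beatles", "origin": "Liverpool"}},
    {"id": "o2", "facts": {"name": "beatles, the"}},
    {"id": "o3", "facts": {"name": "The Rolling Stones"}},
]

duplicates = group_by_normalized_name(objects)
print(duplicates)
```

    Here objects o1 and o2 end up in the same group because both names normalize to the same string; a real system would likely compare other facts as well before treating two objects as duplicates.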

    Unsupervised extraction of facts
    2.
    Granted invention patent (in force)

    Publication number: US09558186B2

    Publication date: 2017-01-31

    Application number: US14460117

    Application date: 2014-08-14

    Applicant: GOOGLE INC.

    Abstract: A system and method for extracting facts from documents. A fact is extracted from a first document. The attribute and value of the fact extracted from the first document are used as a seed attribute-value pair. A second document containing the seed attribute-value pair is analyzed to determine a contextual pattern used in the second document. The contextual pattern is used to extract other attribute-value pairs from the second document. The extracted attributes and values are stored as facts.
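    Illustration: the abstract describes using a seed attribute-value pair to discover a contextual pattern in a second document and then reusing that pattern to extract further pairs. The sketch below shows one simple way this could work, assuming the pattern is the literal text surrounding the seed pair on a single line. The function names, the line-based matching, and the sample HTML are hypothetical and are not taken from the patent.

```python
import re


def infer_pattern(document: str, attribute: str, value: str):
    """Find a line containing the seed pair and generalize it into a regex."""
    for line in document.splitlines():
        a = line.find(attribute)
        if a == -1:
            continue
        v = line.find(value, a + len(attribute))
        if v == -1:
            continue
        prefix = line[:a]                          # text before the attribute
        infix = line[a + len(attribute):v]         # text between attribute and value
        suffix = line[v + len(value):]             # text after the value
        return re.compile(
            "^" + re.escape(prefix) + "(.+?)"
            + re.escape(infix) + "(.+?)"
            + re.escape(suffix) + "$"
        )
    return None


def extract_facts(document: str, pattern):
    """Apply the contextual pattern to every line and collect (attribute, value)."""
    facts = []
    for line in document.splitlines():
        match = pattern.match(line)
        if match:
            facts.append((match.group(1), match.group(2)))
    return facts


# Assumed second document; the seed pair is taken to come from a first document.
second_document = """<tr><td>Capital</td><td>Paris</td></tr>
<tr><td>Population</td><td>67 million</td></tr>
<tr><td>Currency</td><td>Euro</td></tr>
Some unrelated sentence."""

seed = ("Capital", "Paris")
pattern = infer_pattern(second_document, *seed)
facts = extract_facts(second_document, pattern)
print(facts)
```

    The seed pair itself is re-extracted along with the two new pairs, and the unrelated sentence is skipped because it does not fit the inferred context.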

    Entity normalization via name normalization

    Publication number: US10223406B2

    Publication date: 2019-03-05

    Application number: US15637438

    Application date: 2017-06-29

    Applicant: Google Inc.

    Inventor: Jonathan T. Betz

    Abstract: Systems and methods for normalizing entities via name normalization are disclosed. In some implementations, a computer-implemented method of identifying duplicate objects in a plurality of objects is provided. Each object in the plurality of objects is associated with one or more facts, and each of the one or more facts having a value. The method includes: using a computer processor to perform: associating facts extracted from web documents with a plurality of objects; and for each of the plurality of objects, normalizing the value of a name fact, the name fact being among one or more facts associated with the object; processing the plurality of objects in accordance with the normalized value of the name facts of the plurality of objects. In some implementations, normalizing the value of the name fact is optionally carried out by applying a group of normalization rules to the value of the name fact.

    Automatic object reference identification and linking in a browseable fact repository
    4.
    Granted invention patent (in force)

    Publication number: US09092495B2

    Publication date: 2015-07-28

    Application number: US14194534

    Application date: 2014-02-28

    Applicant: GOOGLE INC.

    CPC classification number: G06F17/30572 G06F17/30887

    Abstract: Systems and methods for automatic object reference identification and linking in a browseable fact repository database are provided. In some implementations, a method includes, identifying a set of values from a plurality of facts associated with an entity. The plurality of facts are stored in a fact repository, and a respective fact includes: an attribute and a corresponding value. The method further includes, responsive to a search for a first value included in a first fact in the plurality of facts: identifying a second fact associated with the entity; and causing to be displayed to a user: a link associated with the second fact, and information representing a confidence value associated with the second fact. The link, when selected, invokes a search of the fact repository in accordance with one or more search parameters, which include a value corresponding to an attribute included in the second fact.
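    Illustration: the abstract describes responding to a search for one fact's value by showing links built from the entity's other facts, each with confidence information, where following a link runs a new repository search on that fact's value. A minimal sketch under stated assumptions follows: the in-memory repository, the `Fact` fields, the `/search?q=` URL shape, and the function names are all hypothetical and only illustrate the flow.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Fact:
    entity: str        # identifier of the entity the fact belongs to
    attribute: str
    value: str
    confidence: float  # assumed to be a score in [0, 1]


# Assumed toy fact repository.
REPOSITORY = [
    Fact("e1", "name", "Marie Curie", 0.99),
    Fact("e1", "birthplace", "Warsaw", 0.95),
    Fact("e2", "name", "Warsaw", 0.98),
]


def search(value: str):
    """Return every fact in the repository whose value equals the query."""
    return [fact for fact in REPOSITORY if fact.value == value]


def related_links(query_value: str):
    """For each matching fact, link to the other facts of the same entity."""
    links = []
    for first in search(query_value):
        for second in REPOSITORY:
            if second.entity == first.entity and second is not first:
                # Selecting this link would search the repository for second.value.
                url = "/search?" + urlencode({"q": second.value})
                links.append((second.attribute, url, second.confidence))
    return links


links = related_links("Marie Curie")
print(links)
```

    Following the generated link searches for "Warsaw", which in this toy repository matches the name fact of a second entity, so the user can browse from one object to another.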

    UNSUPERVISED EXTRACTION OF FACTS
    5.
    Invention application (in force)

    Publication number: US20140372473A1

    Publication date: 2014-12-18

    Application number: US14460117

    Application date: 2014-08-14

    Applicant: GOOGLE INC.

    Abstract: A system and method for extracting facts from documents. A fact is extracted from a first document. The attribute and value of the fact extracted from the first document are used as a seed attribute-value pair. A second document containing the seed attribute-value pair is analyzed to determine a contextual pattern used in the second document. The contextual pattern is used to extract other attribute-value pairs from the second document. The extracted attributes and values are stored as facts.

    Entity Normalization Via Name Normalization
    6.
    Invention application

    Publication number: US20170300524A1

    Publication date: 2017-10-19

    Application number: US15637438

    Application date: 2017-06-29

    Applicant: Google Inc.

    Inventor: Jonathan T. Betz

    Abstract: Systems and methods for normalizing entities via name normalization are disclosed. In some implementations, a computer-implemented method of identifying duplicate objects in a plurality of objects is provided. Each object in the plurality of objects is associated with one or more facts, and each of the one or more facts having a value. The method includes: using a computer processor to perform: associating facts extracted from web documents with a plurality of objects; and for each of the plurality of objects, normalizing the value of a name fact, the name fact being among one or more facts associated with the object; processing the plurality of objects in accordance with the normalized value of the name facts of the plurality of objects. In some implementations, normalizing the value of the name fact is optionally carried out by applying a group of normalization rules to the value of the name fact.

    Entity Normalization Via Name Normalization
    7.
    Invention application (in force)

    Publication number: US20140214778A1

    Publication date: 2014-07-31

    Application number: US14229774

    Application date: 2014-03-28

    Applicant: GOOGLE INC.

    Inventor: Jonathan T. Betz

    Abstract: Systems and methods for normalizing entities via name normalization are disclosed. In some implementations, a computer-implemented method of identifying duplicate objects in a plurality of objects is provided. Each object in the plurality of objects is associated with one or more facts, and each of the one or more facts having a value. The method includes: using a computer processor to perform: associating facts extracted from web documents with a plurality of objects; and for each of the plurality of objects, normalizing the value of a name fact, the name fact being among one or more facts associated with the object; processing the plurality of objects in accordance with the normalized value of the name facts of the plurality of objects. In some implementations, normalizing the value of the name fact is optionally carried out by applying a group of normalization rules to the value of the name fact.

    AUTOMATIC OBJECT REFERENCE IDENTIFICATION AND LINKING IN A BROWSEABLE FACT REPOSITORY
    8.
    Invention application (in force)

    Publication number: US20140195520A1

    Publication date: 2014-07-10

    Application number: US14194534

    Application date: 2014-02-28

    Applicant: GOOGLE INC.

    CPC classification number: G06F17/30572 G06F17/30887

    Abstract: Systems and methods for automatic object reference identification and linking in a browseable fact repository database are provided. In some implementations, a method includes, identifying a set of values from a plurality of facts associated with an entity. The plurality of facts are stored in a fact repository, and a respective fact includes: an attribute and a corresponding value. The method further includes, responsive to a search for a first value included in a first fact in the plurality of facts: identifying a second fact associated with the entity; and causing to be displayed to a user: a link associated with the second fact, and information representing a confidence value associated with the second fact. The link, when selected, invokes a search of the fact repository in accordance with one or more search parameters, which include a value corresponding to an attribute included in the second fact.
