Data Services for Enterprises Leveraging Search System Data Assets
    2.
    Invention application
    Legal status: Pending (published)

    Publication No.: US20130346464A1

    Publication date: 2013-12-26

    Application No.: US13527601

    Filing date: 2012-06-20

    IPC classification: G06F15/16

    CPC classification: G06Q10/10

    Abstract: A data service system is described herein which processes raw data assets from at least one network-accessible system (such as a search system), to produce processed data assets. Enterprise applications can then leverage the processed data assets to perform various environment-specific tasks. In one implementation, the data service system can generate any of: synonym resources for use by an enterprise application in providing synonyms for specified terms associated with entities; augmentation resources for use by an enterprise application in providing supplemental information for specified seed information; and spelling-correction resources for use by an enterprise application in providing spelling information for specified terms, and so on.

    Foreign-key detection
    3.
    Granted invention patent
    Legal status: Active

    Publication No.: US08386529B2

    Publication date: 2013-02-26

    Application No.: US12709508

    Filing date: 2010-02-21

    IPC classification: G06F17/30

    CPC classification: G06F17/30306

    Abstract: This patent application relates to foreign-key detection. One implementation obtains a set of data tables. This implementation automatically determines foreign-key relationships of columns from separate tables of the set.
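
The abstract does not say how the foreign-key relationships are determined. As a rough illustration only, the Python sketch below uses a common heuristic that is an assumption here, not the patented method: a column is a candidate foreign key when all of its values appear in a unique (key-like) column of a different table. The function name `candidate_foreign_keys` and the sample tables are hypothetical.

```python
from itertools import permutations


def candidate_foreign_keys(tables):
    """Return (src_table, src_col, dst_table, dst_col) candidates.

    tables maps a table name to {column name: list of values}.
    Heuristic (an assumption, not taken from the patent): the referenced
    column must contain no duplicates, and every value of the referencing
    column must occur in it (an inclusion dependency).
    """
    candidates = []
    for (src_name, src), (dst_name, dst) in permutations(tables.items(), 2):
        for dst_col, dst_vals in dst.items():
            if len(set(dst_vals)) != len(dst_vals):
                continue  # duplicates: the column cannot act as a key
            dst_set = set(dst_vals)
            for src_col, src_vals in src.items():
                if src_vals and set(src_vals) <= dst_set:
                    candidates.append((src_name, src_col, dst_name, dst_col))
    return candidates


# Hypothetical sample data.
tables = {
    "orders": {"order_id": [1, 2, 3], "customer_id": [10, 10, 20]},
    "customers": {"customer_id": [10, 20, 30], "name": ["Ann", "Bo", "Cy"]},
}

found = candidate_foreign_keys(tables)
print(found)
```

On real data a pure inclusion test tends to produce false positives (for example, small integer columns that happen to overlap), so a practical detector would need additional filtering.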

    Foreign-Key Detection
    4.
    Invention application
    Legal status: Active

    Publication No.: US20110208748A1

    Publication date: 2011-08-25

    Application No.: US12709508

    Filing date: 2010-02-21

    IPC classification: G06F17/30

    CPC classification: G06F17/30306

    Abstract: This patent application relates to foreign-key detection. One implementation obtains a set of data tables. This implementation automatically determines foreign-key relationships of columns from separate tables of the set.

    Targeted disambiguation of named entities
    5.
    Granted invention patent
    Legal status: Active

    Publication No.: US09594831B2

    Publication date: 2017-03-14

    Application No.: US13531493

    Filing date: 2012-06-22

    IPC classification: G06F17/30 G06F17/27

    CPC classification: G06F17/30687 G06F17/278

    Abstract: A targeted disambiguation system is described herein which determines true mentions of a list of named entities in a collection of documents. The list of named entities is homogenous in the sense that the entities pertain to the same subject matter domain. The system determines the true mentions by leveraging the homogeneity in the list, and, more specifically by applying a context similarity hypothesis, a co-mention hypothesis, and an interdependency hypothesis. In one implementation, the system executes its analysis using a graph-based model. The system can operate without the existence of additional information regarding the entities in the list; nevertheless, if such information is available, the system can integrate it into its analysis.
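
The abstract names three hypotheses and a graph-based model but gives no formulas. The Python sketch below is one possible toy reading, with every weighting choice an assumption rather than the patent's model: mentions are graph nodes, edges connect mentions of different list entities and are weighted by cosine similarity of their contexts (context similarity), each mention starts with a prior proportional to how many list entities its document mentions (co-mention), and scores are propagated over the edges so that credible mentions support similar ones (interdependency). The entity names and documents are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def score_mentions(mentions, damping=0.5, iterations=20):
    """mentions: list of (entity, doc_id, context_tokens). Returns one score each."""
    n = len(mentions)
    ctx = [Counter(tokens) for _, _, tokens in mentions]

    # Co-mention prior: documents naming several list entities count more.
    entities_in_doc = {}
    for entity, doc, _ in mentions:
        entities_in_doc.setdefault(doc, set()).add(entity)
    prior = [len(entities_in_doc[doc]) for _, doc, _ in mentions]
    total = sum(prior)
    prior = [p / total for p in prior]

    # Context-similarity edges, only between mentions of different entities.
    w = [[cosine(ctx[i], ctx[j]) if mentions[i][0] != mentions[j][0] else 0.0
          for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in w]

    # Interdependency: propagate scores along the weighted edges.
    scores = prior[:]
    for _ in range(iterations):
        scores = [
            damping * prior[j] + (1 - damping) * sum(
                scores[i] * w[i][j] / row_sums[i]
                for i in range(n) if row_sums[i] > 0)
            for j in range(n)
        ]
    return scores


# Hypothetical list of computer brands; "Apple" in d2 refers to the fruit.
mentions = [
    ("Apple", "d1", ["laptop", "price", "review"]),
    ("Dell", "d1", ["laptop", "review", "battery"]),
    ("Apple", "d2", ["pie", "recipe", "fruit"]),
    ("Dell", "d3", ["laptop", "price"]),
]
scores = score_mentions(mentions)
for mention, score in zip(mentions, scores):
    print(mention[0], mention[1], round(score, 3))
```

In this toy data the "Apple" mention in `d1` ends up with a higher score than the one in `d2`, because only the former shares context with mentions of another entity on the list.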

    TARGETED DISAMBIGUATION OF NAMED ENTITIES
    6.
    Invention application
    Legal status: Active

    Publication No.: US20130346421A1

    Publication date: 2013-12-26

    Application No.: US13531493

    Filing date: 2012-06-22

    IPC classification: G06F17/30

    CPC classification: G06F17/30687 G06F17/278

    Abstract: A targeted disambiguation system is described herein which determines true mentions of a list of named entities in a collection of documents. The list of named entities is homogenous in the sense that the entities pertain to the same subject matter domain. The system determines the true mentions by leveraging the homogeneity in the list, and, more specifically by applying a context similarity hypothesis, a co-mention hypothesis, and an interdependency hypothesis. In one implementation, the system executes its analysis using a graph-based model. The system can operate without the existence of additional information regarding the entities in the list; nevertheless, if such information is available, the system can integrate it into its analysis.

    TAGGING ENTITIES WITH DESCRIPTIVE PHRASES
    7.
    Invention application
    Legal status: Active

    Publication No.: US20130132381A1

    Publication date: 2013-05-23

    Application No.: US13298349

    Filing date: 2011-11-17

    IPC classification: G06F17/30

    CPC classification: G06F17/30864 G06F17/30277

    Abstract: A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.
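
The two-stage flow in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the claimed method: co-occurrence is approximated by substring matching, a "context" is taken to be the sentence containing both the entity and the phrase, and the final decision is a plain count threshold. The function names, thresholds, product name "Foo X100", and sample documents are all hypothetical.

```python
from collections import Counter


def domain_phrases(docs, name_labels, candidate_phrases, min_docs=2):
    """Stage 1: keep candidate phrases that co-occur with a domain name
    label (e.g. "camera") in at least min_docs documents."""
    counts = Counter()
    for doc in docs:
        text = doc.lower()
        if any(label in text for label in name_labels):
            counts.update(p for p in candidate_phrases if p in text)
    return {p for p, c in counts.items() if c >= min_docs}


def tag_entity(entity, docs, phrases, min_contexts=2):
    """Stage 2: collect sentences (contexts) where the entity and a phrase
    co-occur, then keep phrases supported by enough contexts."""
    contexts = {}
    for doc in docs:
        for sentence in doc.lower().split("."):
            if entity.lower() in sentence:
                for p in phrases:
                    if p in sentence:
                        contexts.setdefault(p, []).append(sentence.strip())
    return {p: c for p, c in contexts.items() if len(c) >= min_contexts}


# Hypothetical first corpus: used to learn phrases for the "camera" domain.
first_docs = [
    "This camera is light weight and easy to use.",
    "A compact camera with long battery life, very easy to use.",
    "The camera felt light weight.",
    "Stock prices were easy to use as an indicator.",
]
phrases = domain_phrases(
    first_docs, {"camera"}, ["light weight", "easy to use", "long battery life"])

# Hypothetical second corpus: used to tag one specific entity.
second_docs = [
    "The Foo X100 is light weight. I also tried a tripod that is easy to use.",
    "My Foo X100 is so light weight that I carry it daily.",
]
tags = tag_entity("Foo X100", second_docs, phrases)
print(sorted(tags))
```

Restricting contexts to the sentence level is what keeps "easy to use" from being attached to the camera here, since that phrase describes the tripod in the first document.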

    Tagging entities with descriptive phrases
    8.
    Granted invention patent
    Legal status: Active

    Publication No.: US09298825B2

    Publication date: 2016-03-29

    Application No.: US13298349

    Filing date: 2011-11-17

    IPC classification: G06F17/30

    CPC classification: G06F17/30864 G06F17/30277

    Abstract: A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.

    ROBUST DISCOVERY OF ENTITY SYNONYMS USING QUERY LOGS
    9.
    Invention application
    Legal status: Active

    Publication No.: US20130232129A1

    Publication date: 2013-09-05

    Application No.: US13487260

    Filing date: 2012-06-04

    IPC classification: G06F17/30

    CPC classification: G06F17/30672

    Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string r_e. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string s_e and an entity reference string r_e even in the presence of sparse query log data, while another function takes into account the classes of s_e and r_e. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.
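
The abstract does not define its similarity functions, so the Python sketch below substitutes a simple stand-in: two query strings are treated as synonym candidates when the documents clicked for each largely overlap in both directions. It does not implement the sparse-data handling or the class-based function the abstract mentions. A second helper illustrates the stated property of the reduction module, namely that a shorter string contains a subset of the terms of the longer one. All names, the threshold, and the click log are hypothetical.

```python
def click_similarity(candidate_docs, reference_docs):
    """Fraction of the candidate's clicked documents also clicked for the reference."""
    if not candidate_docs:
        return 0.0
    return len(candidate_docs & reference_docs) / len(candidate_docs)


def synonyms(reference, click_log, threshold=0.6):
    """Return strings whose clicked documents overlap with the reference's
    in both directions (a stand-in similarity, not the patent's functions)."""
    ref_docs = click_log.get(reference, set())
    result = []
    for candidate, docs in click_log.items():
        if candidate == reference:
            continue
        if (click_similarity(docs, ref_docs) >= threshold
                and click_similarity(ref_docs, docs) >= threshold):
            result.append(candidate)
    return sorted(result)


def is_reduction(short, long):
    """True if short uses a proper subset of the terms of long."""
    short_terms = set(short.lower().split())
    long_terms = set(long.lower().split())
    return bool(short_terms) and short_terms < long_terms


# Hypothetical query log: query string -> set of clicked document ids.
click_log = {
    "microsoft excel 2010": {"d1", "d2", "d3"},
    "excel 2010": {"d1", "d2", "d3", "d4"},
    "ms spreadsheet": {"d1", "d2"},
    "excel tutorial": {"d3", "d9", "d10"},
}

result = synonyms("microsoft excel 2010", click_log)
print(result)
print(is_reduction("excel 2010", "microsoft excel 2010"))
```

Requiring overlap in both directions is a design choice in this sketch: it rejects broader queries such as "excel tutorial" that share only a few clicked documents with the reference string.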

    Robust discovery of entity synonyms using query logs
    10.
    Granted invention patent
    Legal status: Active

    Publication No.: US08745019B2

    Publication date: 2014-06-03

    Application No.: US13487260

    Filing date: 2012-06-04

    IPC classification: G06F17/30

    CPC classification: G06F17/30672

    Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string r_e. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string s_e and an entity reference string r_e even in the presence of sparse query log data, while another function takes into account the classes of s_e and r_e. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.
