AUTOMATED SOCIAL NETWORKING GRAPH MINING AND VISUALIZATION
    2.
    Invention patent application
    AUTOMATED SOCIAL NETWORKING GRAPH MINING AND VISUALIZATION (status: active)

    Publication No.: US20110283205A1

    Publication date: 2011-11-17

    Application No.: US12780522

    Filing date: 2010-05-14

    IPC classification: G06F3/048 G06F17/30 G06F15/16

    CPC classification: G06F17/30867

    Abstract: The automated social networking graph mining and visualization technique described herein mines social connections and allows creation of a social networking graph from general (not necessarily social-application specific) Web pages. The technique uses the distances between a person's/entity's name and related people's/entities' names on one or more Web pages to determine connections between people/entities and the strengths of the connections. In one embodiment, the technique lays out these connections, and then clusters them, in a 2-D layout of a social networking graph that represents the Web connection strengths among the related people's or entities' names, by using a force-directed model.
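
The distance-based scoring described in this abstract can be illustrated with a minimal sketch. The abstract gives no formula, so the function name `connection_strengths`, the inverse-distance weighting, and the `max_distance` cutoff are all assumptions made for illustration, not the patented method. Names are assumed to be single tokens.

```python
from collections import defaultdict
from itertools import combinations


def connection_strengths(pages, names, max_distance=20):
    """Score each pair of names by how closely they co-occur on the pages.

    Hypothetical weighting: every pair of mentions within `max_distance`
    tokens contributes 1 / distance to the pair's strength.
    """
    wanted = set(names)
    strengths = defaultdict(float)
    for text in pages:
        tokens = text.split()
        # Record the token positions at which each tracked name appears.
        positions = defaultdict(list)
        for index, token in enumerate(tokens):
            word = token.strip(".,;:!?")
            if word in wanted:
                positions[word].append(index)
        # Closer mentions of two different names add more strength.
        for a, b in combinations(sorted(positions), 2):
            for i in positions[a]:
                for j in positions[b]:
                    distance = abs(i - j)
                    if distance <= max_distance:
                        strengths[(a, b)] += 1.0 / distance
    return dict(strengths)


pages = [
    "Alice met Bob at the conference.",
    "Alice later wrote a long report that was eventually read by Carol.",
]
scores = connection_strengths(pages, ["Alice", "Bob", "Carol"])
```

Here the pair mentioned two tokens apart receives a higher score than the pair mentioned eleven tokens apart. Scores of this kind could plausibly serve as edge weights for a force-directed layout, which is not shown.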

    Interactive framework for name disambiguation
    3.
    Granted invention patent
    Interactive framework for name disambiguation (status: active)

    Publication No.: US08538898B2

    Publication date: 2013-09-17

    Application No.: US13118404

    Filing date: 2011-05-28

    IPC classification: G06N5/00

    CPC classification: G06N99/005 G06F17/30616

    Abstract: A “Name Disambiguator” provides various techniques for implementing an interactive framework for resolving or disambiguating entity names (associated with objects such as publications) for entity searches where two or more same or similar names may refer to different entities. More specifically, the Name Disambiguator uses a combination of user input and automatic models to address the disambiguation problem. In various embodiments, the Name Disambiguator uses a two-part process, including: 1) a global SVM trained from large sets of documents or objects in a simulated interactive mode, and 2) further personalization of local SVM models (associated with individual names or groups of names such as, for example, a group of coauthors) derived from the global SVM model. The result of this process is that large sets of documents or objects are rapidly and accurately condensed or clustered into ordered sets that are organized by entity names.

    INTERACTIVE FRAMEWORK FOR NAME DISAMBIGUATION
    4.
    Invention patent application
    INTERACTIVE FRAMEWORK FOR NAME DISAMBIGUATION (status: active)

    Publication No.: US20120303557A1

    Publication date: 2012-11-29

    Application No.: US13118404

    Filing date: 2011-05-28

    IPC classification: G06F15/18

    CPC classification: G06N99/005 G06F17/30616

    Abstract: A “Name Disambiguator” provides various techniques for implementing an interactive framework for resolving or disambiguating entity names (associated with objects such as publications) for entity searches where two or more same or similar names may refer to different entities. More specifically, the Name Disambiguator uses a combination of user input and automatic models to address the disambiguation problem. In various embodiments, the Name Disambiguator uses a two-part process, including: 1) a global SVM trained from large sets of documents or objects in a simulated interactive mode, and 2) further personalization of local SVM models (associated with individual names or groups of names such as, for example, a group of coauthors) derived from the global SVM model. The result of this process is that large sets of documents or objects are rapidly and accurately condensed or clustered into ordered sets that are organized by entity names.

    Web object retrieval based on a language model
    5.
    Granted invention patent
    Web object retrieval based on a language model (status: lapsed)

    Publication No.: US08001130B2

    Publication date: 2011-08-16

    Application No.: US11459857

    Filing date: 2006-07-25

    IPC classification: G06F17/30

    Abstract: A method and system are provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.
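
The scoring steps in this abstract can be sketched roughly as follows. The abstract does not specify the language model or the combination rule, so the unigram model with additive smoothing, the weighted average, and the names `term_probability` and `object_relevance` are assumptions made for illustration only.

```python
from collections import Counter


def term_probability(record_text, term, vocabulary_size, alpha=1.0):
    """Unigram probability of `term` under a record's language model.

    Additive (Laplace) smoothing is an assumption of this sketch.
    """
    tokens = record_text.lower().split()
    counts = Counter(tokens)
    return (counts[term.lower()] + alpha) / (len(tokens) + alpha * vocabulary_size)


def object_relevance(records, term, vocabulary_size):
    """Combine per-record probabilities into one relevance score.

    `records` is a list of (text, weight) pairs; each weight is a
    hypothetical reliability score for the source the record came from.
    """
    total_weight = sum(weight for _, weight in records)
    weighted = sum(
        weight * term_probability(text, term, vocabulary_size)
        for text, weight in records
    )
    return weighted / total_weight


records = [
    ("digital camera with zoom lens", 0.9),  # record from a more reliable source
    ("camera bag and strap", 0.5),           # record from a less reliable source
]
camera_score = object_relevance(records, "camera", vocabulary_size=1000)
tripod_score = object_relevance(records, "tripod", vocabulary_size=1000)
```

A term that appears in the object's records scores higher than one that does not, and records from sources with larger weights contribute more to the combined score.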

    WEB OBJECT RETRIEVAL BASED ON A LANGUAGE MODEL
    6.
    Invention patent application
    WEB OBJECT RETRIEVAL BASED ON A LANGUAGE MODEL (status: under examination, published)

    Publication No.: US20110264658A1

    Publication date: 2011-10-27

    Application No.: US13175796

    Filing date: 2011-07-01

    IPC classification: G06F17/30

    Abstract: A method and system are provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.

    WEB OBJECT RETRIEVAL BASED ON A LANGUAGE MODEL
    7.
    Invention patent application
    WEB OBJECT RETRIEVAL BASED ON A LANGUAGE MODEL (status: lapsed)

    Publication No.: US20080027910A1

    Publication date: 2008-01-31

    Application No.: US11459857

    Filing date: 2006-07-25

    IPC classification: G06F17/30

    Abstract: A method and system are provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.

    Web-scale entity relationship extraction that extracts pattern(s) based on an extracted tuple
    8.
    Granted invention patent
    Web-scale entity relationship extraction that extracts pattern(s) based on an extracted tuple (status: active)

    Publication No.: US08504490B2

    Publication date: 2013-08-06

    Application No.: US12757722

    Filing date: 2010-04-09

    IPC classification: G06F15/18

    Abstract: Methods and systems for Web-scale entity relationship extraction are usable to build large-scale entity relationship graphs from any data corpora stored on a computer-readable medium or accessible through a network. Such entity relationship graphs may be used to navigate previously undiscoverable relationships among entities within data corpora. Additionally, the entity relationship extraction may be configured to utilize discriminative models to jointly model correlated data found within the selected corpora.

    WEB-SCALE ENTITY RELATIONSHIP EXTRACTION
    9.
    Invention patent application
    WEB-SCALE ENTITY RELATIONSHIP EXTRACTION (status: active)

    Publication No.: US20110251984A1

    Publication date: 2011-10-13

    Application No.: US12757722

    Filing date: 2010-04-09

    IPC classification: G06F15/18 G06F17/30

    Abstract: Methods and systems for Web-scale entity relationship extraction are usable to build large-scale entity relationship graphs from any data corpora stored on a computer-readable medium or accessible through a network. Such entity relationship graphs may be used to navigate previously undiscoverable relationships among entities within data corpora. Additionally, the entity relationship extraction may be configured to utilize discriminative models to jointly model correlated data found within the selected corpora.

    Webpage entity extraction through joint understanding of page structures and sentences
    10.
    Granted invention patent
    Webpage entity extraction through joint understanding of page structures and sentences (status: active)

    Publication No.: US09092424B2

    Publication date: 2015-07-28

    Application No.: US12569912

    Filing date: 2009-09-30

    IPC classification: G06F17/00 G06F17/27

    CPC classification: G06F17/278

    Abstract: Described is a technology for understanding entities of a webpage, e.g., to label the entities on the webpage. An iterative and bidirectional framework processes a webpage, including a text understanding component (e.g., extended Semi-CRF model) that provides text segmentation features to a structure understanding component (e.g., extended HCRF model). The structure understanding component uses the text segmentation features and visual layout features of the webpage to identify a structure (e.g., labeled block). The text understanding component in turn uses the labeled block to further understand the text. The process continues iteratively until a similarity criterion is met, at which time the entities may be labeled. Also described is the use of multiple mentions of a set of text in the webpage to help in labeling an entity.
