FRAMEWORK FOR DOCUMENT KNOWLEDGE EXTRACTION
    1.
    Patent application
    Status: under examination (published)

    Publication No.: US20130246435A1

    Publication date: 2013-09-19

    Application No.: US13419690

    Filing date: 2012-03-14

    IPC classification: G06F17/30

    CPC classification: G06F16/355

    Abstract: A knowledge extraction framework may iteratively enrich an ontology that is used to classify structured knowledge obtained from web pages based on structured knowledge previously acquired from other web pages. The framework may enable a user to define the ontology for extracting structured knowledge from a plurality of web pages. The framework applies the ontology using a supervised extraction algorithm to extract seed information from a set of web pages. The framework further applies an unsupervised extraction algorithm to extract the structured knowledge from an additional set of web pages. The framework subsequently maps the structured knowledge to the ontology based on the seed information to enrich the ontology.

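The mapping step in the abstract above (assigning unsupervised extractions to ontology attributes based on seed information) could be sketched as follows. This is a minimal illustration, not the patented method: the function name `map_to_ontology`, the value-overlap criterion, and the `min_overlap` parameter are all assumptions introduced here.

```python
def map_to_ontology(seed, extracted, min_overlap=1):
    """Map unsupervised extractions onto ontology attributes (hypothetical sketch).

    seed:      dict of ontology attribute -> set of seed values
               (assumed output of the supervised step)
    extracted: dict of raw page label -> set of values
               (assumed output of the unsupervised step)
    Returns (enriched, unmapped).
    """
    enriched = {attr: set(values) for attr, values in seed.items()}
    unmapped = {}
    for label, values in extracted.items():
        values = set(values)
        # Assumption: the best attribute is the one sharing the most seed values.
        best_attr = max(seed, key=lambda attr: len(values & set(seed[attr])), default=None)
        if best_attr is not None and len(values & set(seed[best_attr])) >= min_overlap:
            enriched[best_attr] |= values
        else:
            unmapped[label] = values
    return enriched, unmapped


seed = {"director": {"Ang Lee", "Christopher Nolan"}, "genre": {"Drama"}}
extracted = {"Directed by": {"Ang Lee", "Greta Gerwig"}, "Runtime": {"120 min"}}
enriched, unmapped = map_to_ontology(seed, extracted)
```

Here the raw label "Directed by" shares a value with the seed attribute `director`, so its values enrich that attribute, while "Runtime" has no overlap and stays unmapped for a later iteration.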

    Random walk on query pattern graph for query task classification
    2.
    Granted patent
    Status: in force

    Publication No.: US08838512B2

    Publication date: 2014-09-16

    Application No.: US13089103

    Filing date: 2011-04-18

    IPC classification: G06F7/00 G06N5/02 G06F17/30

    CPC classification: G06F17/30864

    Abstract: A classification process may reduce the computational resources and time required to collect and classify training data utilized to enable a user to effectively access online information. According to some implementations, training data is established by defining one or more seed queries and query patterns. A bipartite graph may be constructed using the seed query and query pattern information. A traversal of the bipartite graph can be performed to expand the training data to encompass sufficient data to perform classification of the present search task.

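The graph traversal described in the abstract above might look roughly like the sketch below, which propagates seed scores from queries to patterns and back. The abstract does not give the traversal rule; the uniform transition probabilities, the clamping of seed scores, and the name `expand_seed_queries` are assumptions made for illustration.

```python
from collections import defaultdict


def expand_seed_queries(edges, seed_scores, steps=1):
    """Spread seed scores over a query/pattern bipartite graph (hypothetical sketch).

    edges:       iterable of (query, pattern) pairs
    seed_scores: dict of seed query -> score for the task of interest
    steps:       number of query -> pattern -> query walk steps
    """
    query_to_patterns = defaultdict(set)
    pattern_to_queries = defaultdict(set)
    for query, pattern in edges:
        query_to_patterns[query].add(pattern)
        pattern_to_queries[pattern].add(query)

    scores = dict(seed_scores)
    for _ in range(steps):
        # Half-step 1: each query splits its score evenly among its patterns.
        pattern_scores = defaultdict(float)
        for query, score in scores.items():
            patterns = query_to_patterns.get(query, set())
            for pattern in patterns:
                pattern_scores[pattern] += score / len(patterns)
        # Half-step 2: each pattern splits its score evenly among its queries.
        next_scores = defaultdict(float)
        for pattern, score in pattern_scores.items():
            queries = pattern_to_queries[pattern]
            for query in queries:
                next_scores[query] += score / len(queries)
        # Assumption: seed queries never drop below their initial score.
        for query, score in seed_scores.items():
            next_scores[query] = max(next_scores[query], score)
        scores = dict(next_scores)
    return scores


edges = [
    ("cheap flights to paris", "cheap flights to <city>"),
    ("cheap flights to rome", "cheap flights to <city>"),
    ("rome weather", "<city> weather"),
]
expanded = expand_seed_queries(edges, {"cheap flights to paris": 1.0}, steps=1)
```

After one step, the unlabeled query "cheap flights to rome" receives a score because it shares a pattern with the seed query, while "rome weather" is not reached.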

    BUILD OF WEBSITE KNOWLEDGE TABLES
    3.
    Patent application
    Status: under examination (published)

    Publication No.: US20120284224A1

    Publication date: 2012-11-08

    Application No.: US13100305

    Filing date: 2011-05-04

    IPC classification: G06F17/30

    CPC classification: G06F16/951

    Abstract: Architecture that defines domain knowledge on networks, such as the Internet, as tables where each row is an entity in the target domain and each column is an attribute of these entities. The corresponding values for entity-attribute pairs are the domain knowledge. The architecture provides semi-automatic and systematic ways to extract network knowledge from at least an unstructured and semi-structured network (the Internet), structuralizes the knowledge in table format, and uses the domain tables to build the online updated knowledge base.

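The table layout described in the abstract above (one row per entity, one column per attribute) can be illustrated with a small sketch. The input format of (entity, attribute, value) triples and the helper names `build_table` and `to_rows` are assumptions for illustration; the abstract does not specify how extracted values are represented.

```python
def build_table(triples):
    """Collect (entity, attribute, value) triples into a domain table.

    Returns (columns, table) where columns keeps first-seen attribute order
    and table maps entity -> {attribute: value}.
    """
    columns = []
    table = {}
    for entity, attribute, value in triples:
        table.setdefault(entity, {})[attribute] = value
        if attribute not in columns:
            columns.append(attribute)
    return columns, table


def to_rows(columns, table, missing=None):
    """Flatten the table into rows: [entity, value for each column...]."""
    return [
        [entity] + [attrs.get(column, missing) for column in columns]
        for entity, attrs in table.items()
    ]


triples = [
    ("Canon EOS 5D", "brand", "Canon"),
    ("Canon EOS 5D", "price", "2499"),
    ("Nikon D90", "brand", "Nikon"),
]
columns, table = build_table(triples)
rows = to_rows(columns, table)
```

Attributes that were never extracted for an entity show up as `None`, which marks the cells a later extraction pass would still need to fill.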

    Learning latent semantic space for ranking
    4.
    Granted patent
    Status: in force

    Publication No.: US08239334B2

    Publication date: 2012-08-07

    Application No.: US12344093

    Filing date: 2008-12-24

    CPC classification: G06F17/30675

    Abstract: A tool facilitates learning latent semantics for ranking (LLSR) tailored to the ranking task by leveraging relevance information of query-document pairs to learn a tailored latent semantic space, such that other documents are better ranked for the queries in the subspace. The tool applies an LLSR algorithm, thereby enabling learning of an optimal latent semantic space (LSS) for ranking by utilizing relevance information in the training process of subspace learning. The tool enables optimization of the LSS as a closed-form solution and facilitates reporting the learned LSS.


    CATEGORIZING ONLINE USER BEHAVIOR DATA
    5.
    Patent application
    Status: under examination (published)

    Publication No.: US20110077998A1

    Publication date: 2011-03-31

    Application No.: US12568707

    Filing date: 2009-09-29

    IPC classification: G06Q10/00 G06Q30/00 G06F17/30

    CPC classification: G06Q30/02

    Abstract: A method for categorizing online user behavior data, including creating a target set of users based on an advertiser query, identifying two or more users in the target set having one or more first similar behavior attributes using a MinHash algorithm, and modifying the target set according to the two or more identified users.

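As a rough illustration of the MinHash step named in the abstract above, the sketch below builds MinHash signatures for each user's behavior attributes and reports pairs whose estimated Jaccard similarity exceeds a threshold. The seeded SHA-1 hashing scheme, the threshold, and the function names are assumptions chosen for a deterministic example; the abstract does not say how the hash functions or the similarity cut-off are chosen.

```python
import hashlib
from itertools import combinations


def _seeded_hash(seed, item):
    """Deterministic 64-bit hash of item under a given seed (illustrative choice)."""
    digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=64):
    """Signature = minimum hash value of the set under each seeded hash function."""
    items = list(items)
    if not items:
        raise ValueError("cannot compute a MinHash signature for an empty set")
    return [min(_seeded_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions approximates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def similar_pairs(user_behaviors, threshold=0.5, num_hashes=64):
    """Return user pairs whose estimated similarity is at least the threshold."""
    signatures = {
        user: minhash_signature(behaviors, num_hashes)
        for user, behaviors in user_behaviors.items()
    }
    return [
        (a, b)
        for a, b in combinations(sorted(signatures), 2)
        if estimated_jaccard(signatures[a], signatures[b]) >= threshold
    ]


users = {
    "u1": {"searched:cameras", "viewed:lens-review", "clicked:camera-ad"},
    "u2": {"clicked:camera-ad", "searched:cameras", "viewed:lens-review"},
    "u3": {"searched:recipes", "viewed:cooking-blog"},
}
pairs = similar_pairs(users)
```

Users with identical behavior sets produce identical signatures regardless of attribute order, so `u1` and `u2` are paired; the pairs found this way could then be used to grow or prune the target set.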

    Transfer of learning for query classification
    6.
    Granted patent
    Status: in force

    Publication No.: US08719192B2

    Publication date: 2014-05-06

    Application No.: US13081391

    Filing date: 2011-04-06

    IPC classification: G06N5/02 G06F17/30

    CPC classification: G06N99/005

    Abstract: Transfer of learning trains a new domain for the classification of search queries according to different tasks, as well as the generation of a corresponding domain-specific query classifier that may be used to classify the search queries according to the different tasks in the new domain. The transfer of learning may include preparing a new domain to receive classification knowledge from one or more source domains by populating the new domain with preliminary query patterns extracted from a search engine log. The transfer of learning may further include preparing the classification knowledge in each source domain for transfer to the new domain. The classification knowledge in each source domain may then be transferred to the new domain.


    Related links recommendation
    7.
    Granted patent
    Status: in force

    Publication No.: US08412726B2

    Publication date: 2013-04-02

    Application No.: US12793047

    Filing date: 2010-06-03

    IPC classification: G06F17/30

    CPC classification: G06F17/30867

    Abstract: The related links recommendation technique described herein employs combined collaborative filtering to recommend related web pages to users. The technique creates multiple collaborative filters which are combined in order to create a combined collaborative filter to recommend web pages similar to a given web page to a user. One query-based collaborative filter is created by using query search clicks (e.g., user input device selection actions on search results returned in response to a search query). Another user-behavior-based collaborative filter is created by using query search clicks and user clicks while browsing websites (e.g., user input device selection actions while a user is browsing websites). Lastly, another content-based collaborative filter based on similar content of web pages is created by finding web pages with similar content.

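The abstract above says the three filters are combined but does not say how. One plausible reading, shown here purely as an assumption, is a weighted sum of per-filter similarity scores; the filter names, weights, and the function `combine_filters` are hypothetical.

```python
from collections import defaultdict


def combine_filters(filter_scores, weights, top_k=3):
    """Merge per-filter page scores with a weighted sum (assumed combination rule).

    filter_scores: dict of filter name -> {candidate page: similarity score}
    weights:       dict of filter name -> weight (filters without a weight are ignored)
    Returns the top_k (page, score) pairs, highest score first.
    """
    combined = defaultdict(float)
    for name, scores in filter_scores.items():
        weight = weights.get(name, 0.0)
        for page, score in scores.items():
            combined[page] += weight * score
    # Sort by descending score, breaking ties by page name for stable output.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))[:top_k]


filter_scores = {
    "query_based": {"page1": 0.8, "page2": 0.2},
    "behavior_based": {"page2": 0.6, "page3": 0.4},
    "content_based": {"page1": 0.5, "page3": 0.5},
}
weights = {"query_based": 0.5, "behavior_based": 0.3, "content_based": 0.2}
recommended = combine_filters(filter_scores, weights)
```

A page that scores moderately under several filters can outrank one that scores highly under a single filter, which is the practical benefit of combining them.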

    Indexing Semantic User Profiles for Targeted Advertising
    8.
    Patent application
    Status: in force

    Publication No.: US20130073546A1

    Publication date: 2013-03-21

    Application No.: US13235140

    Filing date: 2011-09-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30321 G06F17/30867

    Abstract: Embodiments facilitate greater flexibility in definition of user segments for targeted advertising, by employing indexed semantic user profiles. Semantic user profiles are built through extraction of online user behavior data such as user search queries and page views, and include user interest information that is inferred based on user behavior. Semantic user profiles are then indexed to facilitate search for a set of users that fit specified semantic search terms. Search results for semantic profiles are ranked according to a ranking model developed through machine learning. In some embodiments, building and indexing of semantic profiles and learning of the ranking model is performed offline to facilitate more efficient online processing of queries.

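The indexing and search steps in the abstract above could be approximated with a plain inverted index from interest terms to users, as sketched below. This is an assumption about one simple realization: the profile format, the AND semantics of multi-term search, and the function names are hypothetical, and the learned ranking model mentioned in the abstract is deliberately left out.

```python
from collections import defaultdict


def build_profile_index(profiles):
    """Build an inverted index: interest term -> set of users (offline step)."""
    index = defaultdict(set)
    for user, interests in profiles.items():
        for term in interests:
            index[term.lower()].add(user)
    return index


def search_users(index, terms):
    """Return users whose profiles contain every search term (assumed AND semantics)."""
    normalized = [term.lower() for term in terms]
    if not normalized:
        return set()
    result = set(index.get(normalized[0], set()))
    for term in normalized[1:]:
        result &= index.get(term, set())
    return result


profiles = {
    "alice": {"Photography", "Travel"},
    "bob": {"Travel", "Cooking"},
    "carol": {"Photography", "Travel", "Hiking"},
}
index = build_profile_index(profiles)
segment = search_users(index, ["travel", "photography"])
```

Building the index ahead of time keeps the online query to a few set intersections, which matches the abstract's point about doing the heavy work offline.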

    Clustering aggregator for RSS feeds
    9.
    Granted patent
    Status: in force

    Publication No.: US07958125B2

    Publication date: 2011-06-07

    Application No.: US12146481

    Filing date: 2008-06-26

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30705

    Abstract: A method for merging Really Simple Syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter, and each cluster frequency.
