RELATED LINKS RECOMMENDATION
    31.
    Patent application
    RELATED LINKS RECOMMENDATION (status: in force)

    Publication No.: US20110302155A1

    Publication date: 2011-12-08

    Application No.: US12793047

    Filing date: 2010-06-03

    IPC classification: G06F17/30

    CPC classification: G06F17/30867

    Abstract: The related links recommendation technique described herein employs combined collaborative filtering to recommend related web pages to users. The technique creates multiple collaborative filters which are combined in order to create a combined collaborative filter to recommend web pages similar to a given web page to a user. One query-based collaborative filter is created by using query search clicks (e.g., user input device selection actions on search results returned in response to a search query). Another user-behavior-based collaborative filter is created by using query search clicks and user clicks while browsing websites (e.g., user input device selection actions while a user is browsing websites). Lastly, another content-based collaborative filter based on similar content of web pages is created by finding web pages with similar content.
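
    The abstract above names three filters (query-based, user-behavior-based, content-based) that are combined into one. The following is a minimal sketch of that idea, not the patented method: the Jaccard similarity measure, the weighted-sum combination, and every function name and sample URL are assumptions made for illustration.

```python
from collections import defaultdict


def co_click_similarity(sessions):
    """Page-to-page similarity from co-occurrence (assumed measure: Jaccard).

    Each session is a list of pages clicked together, e.g. the results
    clicked for one query, or one browsing trail.
    """
    pair_counts = defaultdict(int)
    page_counts = defaultdict(int)
    for pages in sessions:
        unique = sorted(set(pages))
        for page in unique:
            page_counts[page] += 1
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                pair_counts[(a, b)] += 1
    sim = defaultdict(dict)
    for (a, b), together in pair_counts.items():
        # Sessions containing both pages / sessions containing either page.
        score = together / (page_counts[a] + page_counts[b] - together)
        sim[a][b] = score
        sim[b][a] = score
    return sim


def content_similarity(page_terms):
    """Page-to-page similarity from overlapping terms (assumed: Jaccard)."""
    sim = defaultdict(dict)
    pages = sorted(page_terms)
    for i, a in enumerate(pages):
        for b in pages[i + 1:]:
            terms_a, terms_b = set(page_terms[a]), set(page_terms[b])
            union = terms_a | terms_b
            score = len(terms_a & terms_b) / len(union) if union else 0.0
            if score > 0:
                sim[a][b] = score
                sim[b][a] = score
    return sim


def recommend(page, filters, weights, top_k=3):
    """Combine the filters by weighted sum (an assumption) and rank pages."""
    combined = defaultdict(float)
    for sim, weight in zip(filters, weights):
        for other, score in sim.get(page, {}).items():
            combined[other] += weight * score
    return sorted(combined, key=lambda p: (-combined[p], p))[:top_k]


# Hypothetical data: search-click sessions, browsing sessions, page terms.
query_clicks = [["a.com", "b.com"], ["a.com", "b.com", "c.com"]]
browsing = query_clicks + [["a.com", "d.com"]]
page_terms = {
    "a.com": ["news", "sports"],
    "b.com": ["news", "sports"],
    "c.com": ["news"],
    "d.com": ["cooking"],
}
filters = [
    co_click_similarity(query_clicks),   # query-based filter
    co_click_similarity(browsing),       # user-behavior-based filter
    content_similarity(page_terms),      # content-based filter
]
print(recommend("a.com", filters, weights=[1 / 3, 1 / 3, 1 / 3]))
```

    Equal weights are used here only to keep the sketch simple; the abstract does not state how the filters are weighted or combined.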

    Scalable Parallel User Clustering in Discrete Time Window
    32.
    Patent application
    Scalable Parallel User Clustering in Discrete Time Window (status: under examination, published)

    Publication No.: US20100169258A1

    Publication date: 2010-07-01

    Application No.: US12346881

    Filing date: 2008-12-31

    IPC classification: G06N5/02 G06F7/06 G06F17/30

    CPC classification: G06F16/9535

    Abstract: Described is an internet user clustering technology, such as useful in behavioral targeting, in which users are clustered together based on MinHash computations that produce signatures corresponding to users' internet-related activities. In one aspect, users are clustered together based on commonality of signatures between each set of signatures associated with each user. The signature sets and/or clusters may be associated with timestamps, whereby clusters may be determined for a given discrete time window or set of discrete time windows. To facilitate efficient processing, existing, prior signature sets of a user may be incrementally updated (e.g., daily), and/or the MinHash computations for users are partitioned among parallel computing machines. The timestamps may be used to selectively determine a cluster within a continuous time, a time window or set of time windows.
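
    As a rough illustration of the MinHash clustering described above, the sketch below computes one signature per hash function for each user's activity set within a time window, then groups users who share at least one signature. It is a single-machine sketch under assumptions: the SHA-1 based hash family, the "share one signature" clustering rule, and all names and sample events are hypothetical, and the incremental-update and parallel-partitioning aspects are not shown.

```python
import hashlib
from collections import defaultdict


def minhash_signature(activities, num_hashes=4):
    """Return one MinHash value per seeded hash function.

    The hash family (SHA-1 over "seed:activity") is an assumption chosen
    so that the result is deterministic across runs.
    """
    return tuple(
        min(hashlib.sha1(f"{seed}:{a}".encode()).hexdigest() for a in activities)
        for seed in range(num_hashes)
    )


def cluster_users(events, num_hashes=4):
    """Cluster users per discrete time window.

    events: iterable of (user, window, activity) tuples.
    Returns {window: set of frozensets of users sharing a signature}.
    """
    activities = defaultdict(set)            # (window, user) -> activity set
    for user, window, activity in events:
        activities[(window, user)].add(activity)

    buckets = defaultdict(set)               # (window, seed, value) -> users
    for (window, user), acts in activities.items():
        for seed, value in enumerate(minhash_signature(acts, num_hashes)):
            buckets[(window, seed, value)].add(user)

    clusters = defaultdict(set)
    for (window, _seed, _value), users in buckets.items():
        if len(users) > 1:                   # keep only shared signatures
            clusters[window].add(frozenset(users))
    return dict(clusters)


# Hypothetical activity log: u1 and u2 behave identically in window "w1".
events = [
    ("u1", "w1", "visited:news"), ("u1", "w1", "searched:flights"),
    ("u2", "w1", "visited:news"), ("u2", "w1", "searched:flights"),
    ("u3", "w1", "visited:recipes"),
    ("u1", "w2", "visited:news"),
]
print(cluster_users(events))
```

    Because each (window, user) pair is processed independently before bucketing, the signature step could in principle be split across machines, which is the kind of partitioning the abstract mentions.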

    CLUSTERING AGGREGATOR FOR RSS FEEDS
    33.
    Patent application
    CLUSTERING AGGREGATOR FOR RSS FEEDS (status: in force)

    Publication No.: US20090327320A1

    Publication date: 2009-12-31

    Application No.: US12146481

    Filing date: 2008-06-26

    IPC classification: G06F17/30

    CPC classification: G06F17/30705

    Abstract: A method for merging really simple syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter, and each cluster frequency.

    Transfer of learning for query classification
    34.
    Granted patent
    Transfer of learning for query classification (status: in force)

    Publication No.: US08719192B2

    Publication date: 2014-05-06

    Application No.: US13081391

    Filing date: 2011-04-06

    IPC classification: G06N5/02 G06F17/30

    CPC classification: G06N99/005

    Abstract: Transfer of learning trains a new domain for the classification of search queries according to different tasks, as well as the generation of a corresponding domain-specific query classifier that may be used to classify the search queries according to the different tasks in the new domain. The transfer of learning may include preparing a new domain to receive classification knowledge from one or more source domains by populating the new domain with preliminary query patterns extracted for a search engine log. The transfer of learning may further include preparing the classification knowledge in each source domain for transfer to the new domain. The classification knowledge in each source domain may then be transferred to the new domain.

    Related links recommendation
    35.
    Granted patent
    Related links recommendation (status: in force)

    Publication No.: US08412726B2

    Publication date: 2013-04-02

    Application No.: US12793047

    Filing date: 2010-06-03

    IPC classification: G06F17/30

    CPC classification: G06F17/30867

    Abstract: The related links recommendation technique described herein employs combined collaborative filtering to recommend related web pages to users. The technique creates multiple collaborative filters which are combined in order to create a combined collaborative filter to recommend web pages similar to a given web page to a user. One query-based collaborative filter is created by using query search clicks (e.g., user input device selection actions on search results returned in response to a search query). Another user-behavior-based collaborative filter is created by using query search clicks and user clicks while browsing websites (e.g., user input device selection actions while a user is browsing websites). Lastly, another content-based collaborative filter based on similar content of web pages is created by finding web pages with similar content.

    Indexing Semantic User Profiles for Targeted Advertising
    36.
    Patent application
    Indexing Semantic User Profiles for Targeted Advertising (status: in force)

    Publication No.: US20130073546A1

    Publication date: 2013-03-21

    Application No.: US13235140

    Filing date: 2011-09-16

    IPC classification: G06F17/30

    CPC classification: G06F17/30321 G06F17/30867

    Abstract: Embodiments facilitate greater flexibility in definition of user segments for targeted advertising, by employing indexed semantic user profiles. Semantic user profiles are built through extraction of online user behavior data such as user search queries and page views, and include user interest information that is inferred based on user behavior. Semantic user profiles are then indexed to facilitate search for a set of users that fit specified semantic search terms. Search results for semantic profiles are ranked according to a ranking model developed through machine learning. In some embodiments, building and indexing of semantic profiles and learning of the ranking model is performed offline to facilitate more efficient online processing of queries.
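
    One plausible way to picture "indexed semantic user profiles" is an inverted index from interest terms to users, built offline and queried online. The sketch below assumes that structure; the profile format, the function names, and the summed-weight ranking (a stand-in for the machine-learned ranking model, which the abstract does not specify) are all hypothetical.

```python
from collections import defaultdict


def build_index(profiles):
    """Offline step: invert {user: {interest: weight}} into
    {interest: {user: weight}} so users can be looked up by term."""
    index = defaultdict(dict)
    for user, interests in profiles.items():
        for term, weight in interests.items():
            index[term][user] = weight
    return index


def search_users(index, terms):
    """Online step: return users whose profiles match every term.

    Ranking by summed interest weight is an assumed placeholder for a
    learned ranking model.
    """
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    users = set(postings[0]).intersection(*postings[1:])
    scores = {u: sum(p[u] for p in postings) for u in users}
    return sorted(scores, key=lambda u: (-scores[u], u))


# Hypothetical profiles with inferred interest weights.
profiles = {
    "u1": {"travel": 0.9, "golf": 0.2},
    "u2": {"travel": 0.4, "golf": 0.8},
    "u3": {"cooking": 1.0},
}
index = build_index(profiles)
print(search_users(index, ["travel"]))
print(search_users(index, ["travel", "golf"]))
```

    Defining a segment then amounts to choosing search terms rather than hand-maintaining user lists, which matches the flexibility the abstract describes.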

    Clustering aggregator for RSS feeds
    37.
    Granted patent
    Clustering aggregator for RSS feeds (status: in force)

    Publication No.: US07958125B2

    Publication date: 2011-06-07

    Application No.: US12146481

    Filing date: 2008-06-26

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30705

    Abstract: A method for merging really simple syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter, and each cluster frequency.

    Random walk on query pattern graph for query task classification
    38.
    Granted patent
    Random walk on query pattern graph for query task classification (status: in force)

    Publication No.: US08838512B2

    Publication date: 2014-09-16

    Application No.: US13089103

    Filing date: 2011-04-18

    IPC classification: G06F7/00 G06N5/02 G06F17/30

    CPC classification: G06F17/30864

    Abstract: A classification process may reduce the computational resources and time required to collect and classify training data utilized to enable a user to effectively access online information. According to some implementations, training data is established by defining one or more seed queries and query patterns. A bi-partite graph may be constructed using the seed query and query pattern information. A traversal of the bi-partite graph can be performed to expand the training data to encompass sufficient data to perform classification of the present search task.
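
    The sketch below illustrates how a traversal of a query/pattern bipartite graph could expand a small set of labeled seed queries. It is an assumption-laden example rather than the patented procedure: it uses a single two-step random walk (query to pattern to query) with uniform transition probabilities, and the threshold, function name, and sample queries are hypothetical.

```python
from collections import defaultdict


def expand_training_data(edges, seeds, threshold=0.5):
    """Label unlabeled queries by a two-step walk on the bipartite graph.

    edges: iterable of (query, pattern) pairs.
    seeds: {query: label} for the manually labeled seed queries.
    A query receives a label when the probability that a walk
    query -> pattern -> query lands on a seed with that label is at
    least `threshold` (uniform transitions are an assumption).
    """
    query_to_patterns = defaultdict(set)
    pattern_to_queries = defaultdict(set)
    for query, pattern in edges:
        query_to_patterns[query].add(pattern)
        pattern_to_queries[pattern].add(query)

    labeled = dict(seeds)
    for query in sorted(query_to_patterns):
        if query in seeds:
            continue
        patterns = query_to_patterns[query]
        probs = defaultdict(float)
        for pattern in patterns:
            neighbors = pattern_to_queries[pattern]
            for other in neighbors:
                if other in seeds:
                    probs[seeds[other]] += 1 / len(patterns) / len(neighbors)
        if probs:
            best = max(sorted(probs), key=lambda label: probs[label])
            if probs[best] >= threshold:
                labeled[query] = best
    return labeled


# Hypothetical queries linked to the patterns they instantiate.
edges = [
    ("cheap flights to paris", "cheap flights to <city>"),
    ("cheap flights to rome", "cheap flights to <city>"),
    ("pizza recipe", "<food> recipe"),
    ("pasta recipe", "<food> recipe"),
]
seeds = {"cheap flights to paris": "travel", "pizza recipe": "cooking"}
print(expand_training_data(edges, seeds))
```

    Running more walk steps or iterating until the label set stops growing would expand the training data further; the single round trip here keeps the example short.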

    FRAMEWORK FOR DOCUMENT KNOWLEDGE EXTRACTION
    39.
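    Patent application
    FRAMEWORK FOR DOCUMENT KNOWLEDGE EXTRACTION (status: under examination, published)

    Publication No.: US20130246435A1

    Publication date: 2013-09-19

    Application No.: US13419690

    Filing date: 2012-03-14

    IPC classification: G06F17/30

    CPC classification: G06F16/355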

    Abstract: A knowledge extraction framework may iteratively enrich an ontology that is used to classify structured knowledge obtained from web pages based on structured knowledge previously acquired from other web pages. The framework may enable a user to define the ontology for extracting structured knowledge from a plurality of web pages. The framework applies the ontology using a supervised extraction algorithm to extract seed information from a set of web pages. The framework further applies an unsupervised extraction algorithm to extract the structured knowledge from an additional set of web pages. The framework subsequently maps the structured knowledge to the ontology based on the seed information to enrich the ontology.

    BUILD OF WEBSITE KNOWLEDGE TABLES
    40.
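    Patent application
    BUILD OF WEBSITE KNOWLEDGE TABLES (status: under examination, published)

    Publication No.: US20120284224A1

    Publication date: 2012-11-08

    Application No.: US13100305

    Filing date: 2011-05-04

    IPC classification: G06F17/30

    CPC classification: G06F16/951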

    Abstract: Architecture that defines domain knowledge on networks, such as the Internet, as tables where each row is an entity in the target domain and each column is an attribute of these entities. The corresponding values for entity-attribute pairs are the domain knowledge. The architecture provides semi-automatic and systematic ways to extract network knowledge from at least an unstructured and semi-structured network (the Internet), structuralizes the knowledge in table format, and uses the domain tables to build the online updated knowledge base.