LEARNING USER INTENT FROM RULE-BASED TRAINING DATA
    1.
    Invention application
    Status: under examination (published)

    Publication No.: US20110289025A1

    Publication date: 2011-11-24

    Application No.: US12783457

    Filing date: 2010-05-19

    IPC classification: G06F15/18 G06N5/02

    CPC classification: G06N5/025 G06N20/00

    Abstract: The search intent co-learning technique described herein learns user search intents from rule-based training data and denoises and debiases this data. The technique generates several sets of biased and noisy training data using different rules. It trains each of a set of classifiers independently, using a different training data set for each. The classifiers are then used to categorize the training data as well as any unlabeled data. Data confidently classified by one classifier is added to the other training data sets, and wrongly classified data is filtered out of the training data sets, so as to create an accurate training data set with which to train a classifier to learn a user's intent for submitting a search query string or to target a user for on-line advertising based on user behavior.
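
The co-learning loop in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the one-dimensional feature, the nearest-centroid classifier, the margin-based confidence, the 0.6 threshold, and the names `train_centroids`, `predict`, `refine` and `co_learn_round`. The patent abstract does not specify any of them.

```python
from statistics import mean

def train_centroids(examples):
    """Fit a nearest-centroid classifier: label -> mean feature value."""
    by_label = {}
    for x, label in examples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def predict(centroids, x):
    """Return (label, confidence); confidence is a distance margin in [0, 1]."""
    ranked = sorted(centroids, key=lambda label: abs(x - centroids[label]))
    d_best = abs(x - centroids[ranked[0]])
    d_next = abs(x - centroids[ranked[1]])
    total = d_best + d_next
    return ranked[0], ((d_next - d_best) / total if total else 0.0)

def refine(own_set, other_model, unlabeled, threshold):
    """Denoise one training set using the classifier trained on the other set."""
    kept = []
    for x, label in own_set:
        predicted, confidence = predict(other_model, x)
        # Filter out an example only when the other classifier confidently disagrees.
        if predicted != label and confidence >= threshold:
            continue
        kept.append((x, label))
    seen = {x for x, _ in kept}
    for x in unlabeled:
        predicted, confidence = predict(other_model, x)
        # Add unlabeled data that the other classifier labels confidently.
        if confidence >= threshold and x not in seen:
            kept.append((x, predicted))
            seen.add(x)
    return kept

def co_learn_round(set_a, set_b, unlabeled, threshold=0.6):
    """One round: train both classifiers independently, then cross-refine."""
    model_a, model_b = train_centroids(set_a), train_centroids(set_b)
    return (refine(set_a, model_b, unlabeled, threshold),
            refine(set_b, model_a, unlabeled, threshold))

# Hypothetical rule-generated data: feature = share of commercial terms in a query.
rule_a = [(0.1, "info"), (0.2, "info"), (0.9, "buy"), (0.8, "buy"), (0.15, "buy")]
rule_b = [(0.05, "info"), (0.3, "info"), (0.85, "buy"), (0.95, "buy")]
new_a, new_b = co_learn_round(rule_a, rule_b, unlabeled=[0.12, 0.5, 0.88])
```

In this toy run the mislabeled example `(0.15, "buy")` is dropped from the first set because the second classifier confidently labels it `"info"`, while the ambiguous unlabeled point `0.5` is added to neither set.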

    IDENTIFICATION OF SIMILAR QUERIES BASED ON OVERALL AND PARTIAL SIMILARITY OF TIME SERIES
    2.
    Invention application
    Status: in force

    Publication No.: US20090006365A1

    Publication date: 2009-01-01

    Application No.: US11770505

    Filing date: 2007-06-28

    IPC classification: G06F7/00

    CPC classification: G06F17/30864 G06F17/3064

    Abstract: Techniques for identifying similar queries based on their overall similarity and partial similarity of time series of frequencies of the queries are provided. To identify queries that are similar to a target query, the query analysis system generates, for each query, an overall similarity score for that query and the target query based on the time series of the query and the target query. The query analysis system also generates, for each query, partial similarity scores for the query and the target query based on various time sub-series of the overall time series of the queries. The query analysis system then identifies queries as being similar to the target query based on the overall similarity scores and the partial similarity scores of the queries.
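
The combination of overall and partial similarity scores can be sketched as follows. The choice of Pearson correlation as the score, the non-overlapping windows, the 0.9 threshold, the "either score passes" rule, and the names `pearson`, `partial_scores` and `similar_queries` are assumptions for illustration; the abstract does not say how the scores are computed or combined.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def partial_scores(a, b, window):
    """Pearson scores over consecutive, non-overlapping time sub-series."""
    return [pearson(a[i:i + window], b[i:i + window])
            for i in range(0, len(a) - window + 1, window)]

def similar_queries(target, candidates, window=4, threshold=0.9):
    """Map each similar query to its (overall score, best partial score)."""
    found = {}
    for name, series in candidates.items():
        overall = pearson(target, series)
        best_partial = max(partial_scores(target, series, window))
        # Assumed rule: similar if the whole series or any sub-series matches.
        if overall >= threshold or best_partial >= threshold:
            found[name] = (overall, best_partial)
    return found

# Hypothetical weekly query frequencies.
target = [1, 3, 2, 5, 4, 6, 5, 8]
candidates = {
    "scaled": [2, 6, 4, 10, 8, 12, 10, 16],   # same shape throughout
    "partial": [1, 3, 2, 5, 8, 5, 6, 4],      # matches only the first half
    "unrelated": [5, 1, 4, 2, 6, 3, 5, 2],
}
result = similar_queries(target, candidates)
```

Here `"partial"` has a low overall score but is still reported, because its first sub-series tracks the target closely; that is the case partial similarity is meant to catch.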

    REPRESENTING QUERIES AND DETERMINING SIMILARITY BASED ON AN ARIMA MODEL
    3.
    Invention application
    Status: lapsed

    Publication No.: US20090006326A1

    Publication date: 2009-01-01

    Application No.: US11770307

    Filing date: 2007-06-28

    IPC classification: G06F17/30

    CPC classification: G06Q30/02

    Abstract: Representing queries and determining similarity of queries based on an autoregressive integrated moving average (“ARIMA”) model is provided. A query analysis system represents each query by its ARIMA coefficients. The query analysis system may estimate the frequency information for a desired past or future interval based on frequency information for some initial intervals. The query analysis system may also determine the similarity of a pair of queries based on the similarity of their ARIMA coefficients. The query analysis system may use various metrics, such as a correlation metric, to determine the similarity of the ARIMA coefficients.
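
As a rough sketch of the comparison step only, the snippet below treats each query as a vector of already-estimated ARIMA coefficients and ranks other queries by Pearson correlation of those vectors. The coefficient values are hypothetical, the fitting step is assumed to have happened elsewhere (for example with a time-series library), and the names `coefficient_similarity` and `most_similar` are invented for illustration.

```python
from math import sqrt

def coefficient_similarity(coeffs_a, coeffs_b):
    """Pearson correlation between two equal-length coefficient vectors."""
    n = len(coeffs_a)
    mean_a, mean_b = sum(coeffs_a) / n, sum(coeffs_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(coeffs_a, coeffs_b))
    var_a = sum((a - mean_a) ** 2 for a in coeffs_a)
    var_b = sum((b - mean_b) ** 2 for b in coeffs_b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def most_similar(query, coefficients):
    """Return the other query whose coefficient vector correlates best."""
    others = (name for name in coefficients if name != query)
    return max(others, key=lambda name: coefficient_similarity(
        coefficients[query], coefficients[name]))

# Hypothetical coefficient vectors, one per query, all from the same model order.
coefficients = {
    "q_a": [0.60, -0.20, 0.10, 0.40],
    "q_b": [0.55, -0.25, 0.15, 0.35],
    "q_c": [-0.30, 0.50, 0.20, -0.40],
}
closest = most_similar("q_a", coefficients)
```

Comparing fixed-length coefficient vectors rather than raw frequency series keeps the comparison cost independent of how long each query's history is.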

    FORECASTING SEARCH QUERIES BASED ON TIME DEPENDENCIES
    4.
    Invention application
    Status: in force

    Publication No.: US20090006313A1

    Publication date: 2009-01-01

    Application No.: US11770462

    Filing date: 2007-06-28

    IPC classification: G06F17/40

    CPC classification: G06Q30/02

    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
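
The time-dependent branch (detect a periodicity, then forecast from it) might look like the sketch below. Autocorrelation as the periodicity test, the 0.5 threshold, the same-phase average as the forecast, and the names `autocorrelation`, `detect_period` and `forecast_next` are all assumptions; the abstract only states that forecasts are based on periodicities.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var

def detect_period(series, max_lag, threshold=0.5):
    """Return the lag with the strongest autocorrelation above threshold.

    None means no periodicity was found, i.e. the query is treated as
    time-independent in this sketch.
    """
    best_lag, best_score = None, threshold
    for lag in range(2, max_lag + 1):
        score = autocorrelation(series, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def forecast_next(series, period):
    """Forecast the next value as the mean of past values at the same phase."""
    same_phase = series[-period::-period]
    return sum(same_phase) / len(same_phase)

# Hypothetical daily frequencies with a weekly pattern, four weeks long.
weekly = [10, 12, 11, 13, 30, 50, 45] * 4
period = detect_period(weekly, max_lag=10)
next_value = forecast_next(weekly, period)
flat_period = detect_period([5] * 28, max_lag=10)
```

For the weekly series the detected period is 7 and the next value is forecast from the four earlier values at the same weekday; the constant series yields no period.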

    DETERMINATION OF TIME DEPENDENCY OF SEARCH QUERIES
    5.
    Invention application
    Status: lapsed

    Publication No.: US20090006312A1

    Publication date: 2009-01-01

    Application No.: US11770358

    Filing date: 2007-06-28

    IPC classification: G06F7/00

    CPC classification: G06F17/30864 G06Q30/02

    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
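
The "query event" idea for time-independent queries can be sketched like this. Defining a significant increase as a doubling over the previous interval, the two-interval lead window, and the names `find_events` and `precedence_score` are assumptions for illustration; the abstract does not define what counts as significant or how precedence is measured.

```python
def find_events(frequencies, ratio=2.0):
    """Indices where frequency is at least `ratio` times the previous interval."""
    return [i for i in range(1, len(frequencies))
            if frequencies[i - 1] > 0
            and frequencies[i] >= ratio * frequencies[i - 1]]

def precedence_score(leader_events, follower_events, max_gap=2):
    """Fraction of follower events preceded by a leader event within max_gap."""
    if not follower_events:
        return 0.0
    hits = sum(
        any(0 < follower - leader <= max_gap for leader in leader_events)
        for follower in follower_events
    )
    return hits / len(follower_events)

# Hypothetical frequencies of two queries over eight intervals.
query_a = [10, 10, 40, 12, 10, 10, 50, 11]
query_b = [5, 5, 5, 20, 6, 5, 5, 25]

events_a = find_events(query_a)
events_b = find_events(query_b)
a_leads_b = precedence_score(events_a, events_b)
b_leads_a = precedence_score(events_b, events_a)
```

A high score in one direction and a low score in the other suggests that events of the first query could serve as a leading signal when forecasting the second.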

    Smart user-centric information aggregation
    7.
    Granted invention patent
    Status: in force

    Publication No.: US08868598B2

    Publication date: 2014-10-21

    Application No.: US13586711

    Filing date: 2012-08-15

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30032 G06F17/30905

    Abstract: A smart user-centric information aggregation system allows a user to define a region of content displayed in a display of a device and performs information aggregation on behalf of the user. The smart user-centric information aggregation system searches, aggregates and groups information related to content included in the region of content for the user while the user can continue to perform his/her original course of actions without interruption. After finding information related to the desired content, the smart user-centric information aggregation system may notify the user and present the found information to the user upon receiving confirmation from the user. The smart user-centric information aggregation system may continue to find new related information and update the presentation with the newly found information periodically, in some instances without user intervention or input.

    Web Knowledge Extraction for Search Task Simplification
    8.
    Invention application
    Status: in force

    Publication No.: US20130138655A1

    Publication date: 2013-05-30

    Application No.: US13307836

    Filing date: 2011-11-30

    IPC classification: G06F17/30

    CPC classification: G06F17/30702 G06F17/30867

    Abstract: Techniques are described for generating structured information from semi-structured web pages, and retrieving the structured knowledge in response to a user query that indicates a query intent. The structured information is automatically extracted offline from semi-structured web pages, through the use of an auto wrapper solution that is noise tolerant, scalable, and automatic. The structured information is stored in a knowledge base, and provided in response to a user search query that indicates a query intent. Extraction of structured information may also include clustering of pages based on their measured similarities. The clusters may be determined based on similar elements in the tag path text data of the pages. A minimum size threshold may be applied to the clusters.
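
The page-clustering step could be approximated as below. Jaccard similarity over sets of tag paths, the greedy single-pass assignment, both threshold values, and the names `jaccard` and `cluster_pages` are assumptions for illustration; the abstract only says that clusters are based on similar tag path elements and may be subject to a minimum size.

```python
def jaccard(paths_a, paths_b):
    """Jaccard similarity of two sets of tag paths."""
    union = paths_a | paths_b
    return len(paths_a & paths_b) / len(union) if union else 1.0

def cluster_pages(pages, min_similarity=0.5, min_size=2):
    """Greedily group pages by tag-path similarity, then drop small clusters."""
    clusters = []  # each entry: (representative tag paths, member URLs)
    for url, paths in pages.items():
        for representative, members in clusters:
            if jaccard(representative, paths) >= min_similarity:
                members.append(url)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append((set(paths), [url]))
    # Apply the minimum size threshold.
    return [members for _, members in clusters if len(members) >= min_size]

# Hypothetical pages, each described by the tag paths leading to its text nodes.
pages = {
    "p1": {"html/body/div/h1", "html/body/div/table/tr/td", "html/body/div/span"},
    "p2": {"html/body/div/h1", "html/body/div/table/tr/td", "html/body/div/p"},
    "p3": {"html/body/ul/li", "html/body/ul/li/a"},
}
clusters = cluster_pages(pages)
```

Pages `p1` and `p2` share a template and end up together, while `p3` forms a singleton cluster that the size threshold discards, so no wrapper would be induced from it.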

    Identification of similar queries based on overall and partial similarity of time series
    9.
    Granted invention patent
    Status: in force

    Publication No.: US08290921B2

    Publication date: 2012-10-16

    Application No.: US11770505

    Filing date: 2007-06-28

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30864 G06F17/3064

    Abstract: Techniques for identifying similar queries based on their overall similarity and partial similarity of time series of frequencies of the queries are provided. To identify queries that are similar to a target query, the query analysis system generates, for each query, an overall similarity score for that query and the target query based on the time series of the query and the target query. The query analysis system also generates, for each query, partial similarity scores for the query and the target query based on various time sub-series of the overall time series of the queries. The query analysis system then identifies queries as being similar to the target query based on the overall similarity scores and the partial similarity scores of the queries.

    TRANSFER OF LEARNING FOR QUERY CLASSIFICATION
    10.
    Invention application
    Status: in force

    Publication No.: US20120259801A1

    Publication date: 2012-10-11

    Application No.: US13081391

    Filing date: 2011-04-06

    IPC classification: G06F15/18

    CPC classification: G06N99/005

    Abstract: Transfer of learning trains a new domain for the classification of search queries according to different tasks, as well as the generation of a corresponding domain-specific query classifier that may be used to classify the search queries according to the different tasks in the new domain. The transfer of learning may include preparing a new domain to receive classification knowledge from one or more source domains by populating the new domain with preliminary query patterns extracted from a search engine log. The transfer of learning may further include preparing the classification knowledge in each source domain for transfer to the new domain. The classification knowledge in each source domain may then be transferred to the new domain.
