Learning latent semantic space for ranking
    21.
    Granted invention patent
    Learning latent semantic space for ranking [in force]

    Publication No.: US08239334B2

    Publication date: 2012-08-07

    Application No.: US12344093

    Filing date: 2008-12-24

    CPC classification: G06F17/30675

    Abstract: A tool facilitates learning latent semantics for ranking (LLSR) by leveraging relevance information of query-document pairs to learn a latent semantic space tailored to the ranking task, such that other documents are better ranked for the queries in the subspace. The tool applies an algorithm integrating LLSR, which enables learning an optimal latent semantic space (LSS) for ranking by using relevance information in the training process of subspace learning. The tool enables optimization of the LSS as a closed-form solution and facilitates reporting the learned LSS.

    CATEGORIZING ONLINE USER BEHAVIOR DATA
    22.
    Invention application
    CATEGORIZING ONLINE USER BEHAVIOR DATA [under examination, published]

    Publication No.: US20110077998A1

    Publication date: 2011-03-31

    Application No.: US12568707

    Filing date: 2009-09-29

    IPC classification: G06Q10/00 G06Q30/00 G06F17/30

    CPC classification: G06Q30/02

    Abstract: A method for categorizing online user behavior data, including: creating a target set of users based on an advertiser query; identifying, using a MinHash algorithm, two or more users in the target set having one or more first similar behavior attributes; and modifying the target set according to the two or more identified users.
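
The identification step can be illustrated with a small MinHash sketch. This is only an assumed reading of the abstract: the function names, the choice of seeded BLAKE2b hashes, the 64-hash signature length, and the 0.5 threshold are all hypothetical and do not come from the application.

```python
import hashlib
from itertools import combinations


def minhash_signature(attributes, num_hashes=64):
    """Return one minimum hash value per seeded hash function."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{a}".encode("utf-8"),
                                digest_size=8).digest(),
                "big")
            for a in attributes))
    return signature


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def similar_user_pairs(user_attributes, threshold=0.5, num_hashes=64):
    """List pairs of users whose estimated similarity reaches the threshold."""
    # Users without any attribute are skipped (min() of an empty set is undefined).
    sigs = {u: minhash_signature(attrs, num_hashes)
            for u, attrs in user_attributes.items() if attrs}
    return [(u, v) for u, v in combinations(sorted(sigs), 2)
            if estimated_similarity(sigs[u], sigs[v]) >= threshold]


# Hypothetical behavior attributes for three users in a target set.
target_set = {
    "u1": {"sports", "news", "travel"},
    "u2": {"sports", "news", "travel"},
    "u3": {"cooking"},
}
print(similar_user_pairs(target_set))
```

How the target set is then modified (for example, by expanding or pruning it around the identified users) is not specified in the abstract, so the sketch stops at listing the similar pairs.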

    LEARNING USER INTENT FROM RULE-BASED TRAINING DATA
    23.
    Invention application
    LEARNING USER INTENT FROM RULE-BASED TRAINING DATA [under examination, published]

    Publication No.: US20110289025A1

    Publication date: 2011-11-24

    Application No.: US12783457

    Filing date: 2010-05-19

    IPC classification: G06F15/18 G06N5/02

    CPC classification: G06N5/025 G06N20/00

    Abstract: The search intent co-learning technique described herein learns user search intents from rule-based training data and denoises and debiases this data. The technique generates several sets of biased and noisy training data using different rules. It trains each of a set of classifiers independently, using a different training data set for each. The classifiers are then used to categorize the training data as well as any unlabeled data. Data confidently classified by one classifier is added to the other training data sets, and wrongly classified data is filtered out of the training data sets, so as to create an accurate training data set with which to train a classifier to learn a user's intent for submitting a search query string or targeting a user for on-line advertising based on user behavior.

    IDENTIFICATION OF SIMILAR QUERIES BASED ON OVERALL AND PARTIAL SIMILARITY OF TIME SERIES
    24.
    Invention application
    IDENTIFICATION OF SIMILAR QUERIES BASED ON OVERALL AND PARTIAL SIMILARITY OF TIME SERIES [in force]

    Publication No.: US20090006365A1

    Publication date: 2009-01-01

    Application No.: US11770505

    Filing date: 2007-06-28

    IPC classification: G06F7/00

    CPC classification: G06F17/30864 G06F17/3064

    Abstract: Techniques are provided for identifying similar queries based on the overall similarity and partial similarity of the time series of the queries' frequencies. To identify queries that are similar to a target query, the query analysis system generates, for each query, an overall similarity score for that query and the target query based on the time series of the query and the target query. The query analysis system also generates, for each query, partial similarity scores for the query and the target query based on various time sub-series of the overall time series of the queries. The query analysis system then identifies queries as being similar to the target query based on the overall similarity scores and the partial similarity scores of the queries.
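
A minimal sketch of the overall and partial scoring follows. It assumes Pearson correlation as the similarity measure, fixed non-overlapping windows as the sub-series, and a simple "either score is high enough" decision rule; none of these choices, nor the function names and thresholds, are stated in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def partial_scores(x, y, window):
    """Similarity scores over consecutive, non-overlapping sub-series."""
    return [pearson(x[i:i + window], y[i:i + window])
            for i in range(0, len(x) - window + 1, window)]


def is_similar(x, y, window, overall_min=0.8, partial_min=0.95):
    """Assumed rule: similar if the overall score or any partial score is high."""
    overall = pearson(x, y)
    partial = partial_scores(x, y, window)
    return overall >= overall_min or max(partial) >= partial_min


# Hypothetical daily frequencies of a target query and two candidates.
target = [1, 2, 3, 4, 5, 6, 7, 8]
same_shape = [2, 4, 6, 8, 10, 12, 14, 16]
first_half_only = [1, 2, 3, 4, 9, 1, 7, 2]
print(is_similar(target, same_shape, window=4))
print(is_similar(target, first_half_only, window=4))
```

The second candidate tracks the target only during the first window, which is the kind of case a partial score is meant to catch when the overall score is low.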

    REPRESENTING QUERIES AND DETERMINING SIMILARITY BASED ON AN ARIMA MODEL
    25.
    Invention application
    REPRESENTING QUERIES AND DETERMINING SIMILARITY BASED ON AN ARIMA MODEL [lapsed]

    Publication No.: US20090006326A1

    Publication date: 2009-01-01

    Application No.: US11770307

    Filing date: 2007-06-28

    IPC classification: G06F17/30

    CPC classification: G06Q30/02

    Abstract: Techniques are provided for representing queries and determining the similarity of queries based on an autoregressive integrated moving average (“ARIMA”) model. A query analysis system represents each query by its ARIMA coefficients. The query analysis system may estimate the frequency information for a desired past or future interval based on frequency information for some initial intervals. The query analysis system may also determine the similarity of a pair of queries based on the similarity of their ARIMA coefficients. The query analysis system may use various metrics, such as a correlation metric, to determine the similarity of the ARIMA coefficients.
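
As a rough illustration of comparing queries by their model coefficients, the sketch below fits only the autoregressive part of such a model, using the Levinson-Durbin recursion on the sample autocovariances, and compares two coefficient vectors with a Pearson correlation. The differencing and moving-average parts of ARIMA are deliberately omitted, and the function names, the model order of 3, and the synthetic series are assumptions made for illustration rather than details from the abstract.

```python
import random
from math import sqrt


def autocovariance(series, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    c = [v - mean for v in series]
    return [sum(c[i] * c[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def ar_coefficients(series, order):
    """Solve the Yule-Walker equations with the Levinson-Durbin recursion."""
    r = autocovariance(series, order)
    coeffs, err = [], r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for the step from order m-1 to order m.
        k = (r[m] - sum(coeffs[j] * r[m - 1 - j] for j in range(m - 1))) / err
        coeffs = [coeffs[j] - k * coeffs[m - 2 - j] for j in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return coeffs


def correlation(u, v):
    """Pearson correlation between two coefficient vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    vu = sum((a - mu) ** 2 for a in u)
    vv = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(vu * vv) if vu > 0 and vv > 0 else 0.0


def simulate_ar1(phi, n, seed):
    """Synthetic stand-in for a query's frequency series: x[t] = phi*x[t-1] + noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


query_a = ar_coefficients(simulate_ar1(0.8, 2000, seed=1), order=3)
query_b = ar_coefficients(simulate_ar1(0.8, 2000, seed=2), order=3)
query_c = ar_coefficients(simulate_ar1(-0.8, 2000, seed=3), order=3)
print(correlation(query_a, query_b), correlation(query_a, query_c))
```

Two series generated by the same dynamics should yield strongly correlated coefficient vectors, while a series with opposite dynamics should not.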

    FORECASTING SEARCH QUERIES BASED ON TIME DEPENDENCIES
    26.
    Invention application
    FORECASTING SEARCH QUERIES BASED ON TIME DEPENDENCIES [in force]

    Publication No.: US20090006313A1

    Publication date: 2009-01-01

    Application No.: US11770462

    Filing date: 2007-06-28

    IPC classification: G06F17/40

    CPC classification: G06Q30/02

    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. The query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
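
The three building blocks named in the abstract (a periodicity test, a periodicity-based forecast, and event detection) might be sketched as follows. The autocorrelation test, the seasonal-naive forecast, the moving-average baseline, and every threshold and function name here are assumptions for illustration; the abstract does not say how the system implements any of them, and the causal forecasting of time-independent queries is not attempted.

```python
def autocorrelation(series, lag):
    """Normalized autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    c = [v - mean for v in series]
    denom = sum(v * v for v in c)
    if denom == 0:
        return 0.0
    return sum(c[i] * c[i + lag] for i in range(n - lag)) / denom


def dominant_period(series, max_lag, min_corr=0.5):
    """Return the lag (>= 2) with the strongest autocorrelation, or None.

    A query with a dominant period is treated as time-dependent.
    Note that a strongly trending series can also pass this simple test.
    """
    best_lag, best = None, min_corr
    for lag in range(2, max_lag + 1):
        score = autocorrelation(series, lag)
        if score > best:
            best_lag, best = lag, score
    return best_lag


def seasonal_forecast(series, period, steps):
    """Forecast by repeating the most recent full period."""
    start = len(series) - period
    return [series[start + (k % period)] for k in range(steps)]


def query_events(series, window=3, factor=2.0):
    """Indices where the frequency exceeds `factor` times the recent average."""
    events = []
    for i in range(window, len(series)):
        baseline = sum(series[i - window:i]) / window
        if series[i] > factor * baseline:
            events.append(i)
    return events


# Hypothetical daily counts: a weekly pattern and a one-off spike.
weekly = [10, 10, 10, 10, 10, 30, 30] * 4
spike = [5] * 10 + [50] + [5] * 10
print(dominant_period(weekly, max_lag=10), dominant_period(spike, max_lag=10))
print(seasonal_forecast(weekly, period=7, steps=7))
print(query_events(spike))
```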

    DETERMINATION OF TIME DEPENDENCY OF SEARCH QUERIES
    27.
    Invention application
    DETERMINATION OF TIME DEPENDENCY OF SEARCH QUERIES [lapsed]

    Publication No.: US20090006312A1

    Publication date: 2009-01-01

    Application No.: US11770358

    Filing date: 2007-06-28

    IPC classification: G06F7/00

    CPC classification: G06F17/30864 G06Q30/02

    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. The query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.

    Identification of similar queries based on overall and partial similarity of time series
    28.
    Granted invention patent
    Identification of similar queries based on overall and partial similarity of time series [in force]

    Publication No.: US08290921B2

    Publication date: 2012-10-16

    Application No.: US11770505

    Filing date: 2007-06-28

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30864 G06F17/3064

    Abstract: Techniques are provided for identifying similar queries based on the overall similarity and partial similarity of the time series of the queries' frequencies. To identify queries that are similar to a target query, the query analysis system generates, for each query, an overall similarity score for that query and the target query based on the time series of the query and the target query. The query analysis system also generates, for each query, partial similarity scores for the query and the target query based on various time sub-series of the overall time series of the queries. The query analysis system then identifies queries as being similar to the target query based on the overall similarity scores and the partial similarity scores of the queries.

    PREDICTION OF FUTURE POPULARITY OF QUERY TERMS
    29.
    Invention application
    PREDICTION OF FUTURE POPULARITY OF QUERY TERMS [under examination, published]

    Publication No.: US20090222321A1

    Publication date: 2009-09-03

    Application No.: US12147468

    Filing date: 2008-06-26

    IPC classification: G06F17/30

    Abstract: Disclosed are a system and method that enable a computer system to predict which query terms in a search will be popular. The system creates a unified model that determines the future popularity of a query term over a future period of time. The unified model averages the results of three different prediction models to obtain a prediction of the future popularity of a query term. The prediction from the unified model is compared against a threshold value of popularity over a time period. When the predicted popularity of the query term exceeds the threshold, the term is stored. In some embodiments, the period during which the term exceeds the threshold may also be stored.
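
The averaging-and-threshold logic can be sketched as below. The abstract does not name the three prediction models, so the last-value, moving-average, and linear-trend predictors used here are placeholders, as are all function names and the threshold value.

```python
def last_value(history):
    """Placeholder model 1: tomorrow looks like today."""
    return history[-1]


def moving_average(history, window=3):
    """Placeholder model 2: mean of the most recent observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def linear_trend(history):
    """Placeholder model 3: least-squares line extrapolated one step ahead."""
    n = len(history)
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    if sxx == 0:
        return mean_y
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in enumerate(history)) / sxx
    return mean_y + slope * (n - mean_x)


def unified_prediction(history):
    """Average the three model outputs, as the abstract describes."""
    models = (last_value, moving_average, linear_trend)
    return sum(model(history) for model in models) / len(models)


def popular_terms(histories, threshold):
    """Keep the terms whose predicted popularity exceeds the threshold."""
    predictions = {term: unified_prediction(h) for term, h in histories.items()}
    return {term: p for term, p in predictions.items() if p > threshold}


# Hypothetical popularity histories for two query terms.
histories = {"rising term": [1, 2, 3, 4, 5], "flat term": [2, 2, 2, 2]}
print(popular_terms(histories, threshold=3))
```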

    Method and system for determining similarity of items based on similarity objects and their features
    30.
    Granted invention patent
    Method and system for determining similarity of items based on similarity objects and their features [in force]

    Publication No.: US07533094B2

    Publication date: 2009-05-12

    Application No.: US10997749

    Filing date: 2004-11-23

    IPC classification: G06F17/30

    Abstract: A method and system for determining similarity between items is provided. To calculate similarity scores for pairs of items, the similarity system initializes a similarity score for each pair of objects and each pair of features. The similarity system then iteratively calculates the similarity scores for each pair of objects based on the similarity scores of the pairs of features calculated during a previous iteration and calculates the similarity scores for each pair of features based on the similarity scores of the pairs of objects calculated during a previous iteration. The similarity system implements an algorithm that is based on a recursive definition of the similarities between objects and between features. The similarity system continues the iterations of recalculating the similarity scores until the similarity scores converge on a solution.
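
The alternating update described above can be sketched with a SimRank-style rule on a bipartite object-feature graph. The specific update formula (decayed average of the other side's pairwise scores), the decay constant, the tolerance, and the function name are assumptions; the abstract only states that the two sets of scores are recomputed from each other until they converge.

```python
from itertools import product


def co_similarity(object_features, decay=0.8, tol=1e-6, max_iter=100):
    """Iterate object and feature similarities until the scores converge."""
    objects = sorted(object_features)
    features = sorted({f for fs in object_features.values() for f in fs})
    feature_objects = {f: [o for o in objects if f in object_features[o]]
                       for f in features}

    # Initialization: every item is fully similar to itself and to nothing else.
    obj_sim = {(a, b): float(a == b) for a, b in product(objects, repeat=2)}
    feat_sim = {(f, g): float(f == g) for f, g in product(features, repeat=2)}

    def update(items, neighbors, other_sim):
        """Score each pair from the previous scores of its neighbors' pairs."""
        new = {}
        for a, b in product(items, repeat=2):
            na, nb = neighbors[a], neighbors[b]
            if a == b:
                new[(a, b)] = 1.0
            elif not na or not nb:
                new[(a, b)] = 0.0
            else:
                total = sum(other_sim[(x, y)] for x in na for y in nb)
                new[(a, b)] = decay * total / (len(na) * len(nb))
        return new

    for _ in range(max_iter):
        new_obj = update(objects, object_features, feat_sim)
        new_feat = update(features, feature_objects, obj_sim)
        delta = max(
            max((abs(new_obj[k] - obj_sim[k]) for k in obj_sim), default=0.0),
            max((abs(new_feat[k] - feat_sim[k]) for k in feat_sim), default=0.0),
        )
        obj_sim, feat_sim = new_obj, new_feat
        if delta < tol:
            break
    return obj_sim, feat_sim


# Hypothetical objects (documents) and their features (terms).
docs = {"doc1": {"apple", "fruit"}, "doc2": {"apple", "fruit"}, "doc3": {"car"}}
obj_sim, feat_sim = co_similarity(docs)
print(obj_sim[("doc1", "doc2")], feat_sim[("apple", "fruit")])
```

Both updates in a round read only the scores from the previous round, which matches the abstract's wording about values "calculated during a previous iteration".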
