LEARNING A DOCUMENT RANKING USING A LOSS FUNCTION WITH A RANK PAIR OR A QUERY PARAMETER
    31.
    Type: Patent application
    Legal status: Active

    Publication No.: US20080027925A1

    Publication date: 2008-01-31

    Application No.: US11460838

    Filing date: 2006-07-28

    IPC classification: G06F17/30

    Abstract: A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.

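The two ideas in this abstract (weighting misordered pairs by the relevance levels involved, and normalizing each query's contribution) can be illustrated with a small sketch. This is not the patent's formulation: the logistic pairwise loss, the `pair_weight` callback, and normalization by the number of pairs are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

def weighted_pairwise_loss(examples, score, pair_weight):
    """Pairwise ranking loss with rank-pair weights and per-query normalization.

    examples:    list of (query_id, features, relevance_label)
    score:       function mapping features to a float (the ranking function)
    pair_weight: hypothetical function (higher_label, lower_label) -> float,
                 e.g. larger when the higher label is "highly relevant"
    """
    by_query = defaultdict(list)
    for query_id, features, label in examples:
        by_query[query_id].append((score(features), label))

    total = 0.0
    for docs in by_query.values():
        query_loss, n_pairs = 0.0, 0
        for s_i, y_i in docs:
            for s_j, y_j in docs:
                if y_i > y_j:  # document i should rank above document j
                    # Logistic loss on the score margin (an assumed choice).
                    loss = math.log1p(math.exp(-(s_i - s_j)))
                    query_loss += pair_weight(y_i, y_j) * loss
                    n_pairs += 1
        if n_pairs:
            # Assumed normalization: average over pairs, so a query with
            # many relevant documents does not dominate the total.
            total += query_loss / n_pairs
    return total
```

With this normalization, duplicating every document of a query leaves that query's contribution unchanged, which is the property the abstract describes.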

    Correlating Categories Using Taxonomy Distance and Term Space Distance
    32.
    Type: Patent application
    Legal status: Active

    Publication No.: US20070214186A1

    Publication date: 2007-09-13

    Application No.: US11375606

    Filing date: 2006-03-13

    IPC classification: G06F17/30

    CPC classification: G06K9/6282, Y10S707/99936

    Abstract: A method and system for determining similarity or correlation between categories of a hierarchical taxonomy for documents by combining heterogeneous similarity metrics is provided. A correlation system uses both a taxonomy distance metric and a term space distance metric to represent the similarity between categories. The correlation system finds a new distance metric for categories that factors in both the taxonomy distance metric and the term space distance metric. The new distance metric can then be used by classifiers to more accurately represent the correlation between categories.

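As a rough illustration of combining the two metrics, the sketch below measures taxonomy distance as the number of edges between two categories in a tree and term space distance as cosine distance between term vectors, then blends them linearly. The abstract does not say how the new metric is found, so the linear blend, the `alpha` parameter, and the `max_tax` scaling are assumptions.

```python
import math

def taxonomy_distance(parent, a, b):
    """Number of edges on the tree path between categories a and b.

    parent maps each non-root category to its parent category.
    """
    def chain_to_root(node):
        chain = [node]
        while node in parent:
            node = parent[node]
            chain.append(node)
        return chain

    depth_from_a = {node: i for i, node in enumerate(chain_to_root(a))}
    for j, node in enumerate(chain_to_root(b)):
        if node in depth_from_a:  # first common ancestor
            return depth_from_a[node] + j
    raise ValueError("categories are not in the same tree")

def cosine_distance(u, v):
    """1 - cosine similarity of two sparse term-weight dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return 1.0 - dot / norm if norm else 1.0

def combined_distance(parent, terms, a, b, alpha=0.5, max_tax=1.0):
    """Assumed linear blend of scaled taxonomy distance and term distance."""
    tax = taxonomy_distance(parent, a, b) / max_tax
    return alpha * tax + (1.0 - alpha) * cosine_distance(terms[a], terms[b])
```

Here `alpha` controls how much the hierarchy matters relative to vocabulary overlap; a learned combination could replace the fixed blend.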

    Event detection based on evolution of click-through data
    33.
    Type: Patent application
    Legal status: Active

    Publication No.: US20070214115A1

    Publication date: 2007-09-13

    Application No.: US11375610

    Filing date: 2006-03-13

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: A method and system for detecting events based on query-page relationships is provided. The event detection system detects events by analyzing occurrences of query-page pairs generated from a user selecting the page of the pair from a search result for the query of the pair. The event detection system may identify semantic and temporal similarity between query-page pairs. The event detection system then identifies clusters of query-page pairs that are semantically and temporally similar.


    Projecting queries and images into a similarity space
    34.
    Type: Patent application
    Legal status: Active

    Publication No.: US20070214114A1

    Publication date: 2007-09-13

    Application No.: US11375528

    Filing date: 2006-03-13

    IPC classification: G06F17/30

    Abstract: A method and system for projecting queries and images into a similarity space where queries are close to their relevant images is provided. A similarity space projection (“SSP”) system learns a query projection function and an image projection function based on training data. The query projection function projects the relevance of the most relevant words of a query into a similarity space and the image projection function projects the relevance to an image of the most relevant words of a query into the same similarity space so that queries and their relevant images are close in the similarity space. The SSP system can then identify images that are relevant to a target query and queries that are relevant to a target image using the projection functions.


    Augmenting a training set for document categorization
    35.
    Type: Patent application
    Legal status: Active

    Publication No.: US20070112753A1

    Publication date: 2007-05-17

    Application No.: US11273714

    Filing date: 2005-11-14

    IPC classification: G06F17/30

    Abstract: A method and system for augmenting a training set used to train a classifier of documents is provided. The augmentation system augments a training set with training data derived from features of documents based on a document hierarchy. The training data of the initial training set may be derived from the root documents of the hierarchies of documents. The augmentation system generates additional training data that includes an aggregate feature that represents the overall characteristics of a hierarchy of documents, rather than just the root document. After the training data is generated, the augmentation system augments the initial training set with the newly generated training data.

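One simple reading of the "aggregate feature" is a sum of term counts over a root document and every document beneath it. The sketch below takes that reading; the summation, the `children` and `features` dictionaries, and the function names are illustrative assumptions, not details from the patent.

```python
from collections import Counter

def aggregate_feature(root, children, features):
    """Sum term counts over a root document and all of its descendants.

    children maps a document id to a list of child document ids.
    features maps a document id to a dict of term counts.
    """
    total = Counter()
    stack = [root]
    while stack:
        doc = stack.pop()
        total.update(features.get(doc, {}))   # add this document's counts
        stack.extend(children.get(doc, []))   # then visit its children
    return total

def augment_training_set(labeled_roots, children, features):
    """Return root-only examples followed by hierarchy-level examples."""
    initial = [(Counter(features[root]), label)
               for root, label in labeled_roots]
    extra = [(aggregate_feature(root, children, features), label)
             for root, label in labeled_roots]
    return initial + extra
```

Each labeled root thus yields two training examples: one built from the root document alone and one that reflects the whole hierarchy.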

    Active prediction of diverse search intent based upon user browsing behavior

    Publication No.: US10204163B2

    Publication date: 2019-02-12

    Application No.: US12762423

    Filing date: 2010-04-19

    Applicant: Bin Gao, Tie-Yan Liu

    Inventor: Bin Gao, Tie-Yan Liu

    IPC classification: G06F17/30

    Abstract: Many search engines attempt to understand and predict a user's search intent after the submission of search queries. Predicting search intent allows search engines to tailor search results to particular information needs of the user. Unfortunately, current techniques passively predict search intent after a query is submitted. Accordingly, one or more systems and/or techniques for actively predicting search intent from user browsing behavior data are disclosed herein. For example, search patterns of a user browsing a web page and shortly thereafter performing a query may be extracted from user browsing behavior. Queries within the search patterns may be ranked based upon a search trigger likelihood that content of the web page motivated the user to perform the query. In this way, query suggestions having a high search trigger likelihood and a diverse range of topics may be generated and/or presented to users of the web page.

    Cost-Per-Action Model Based on Advertiser-Reported Actions
    37.
    Type: Patent application
    Legal status: Under examination (published)

    Publication No.: US20130246167A1

    Publication date: 2013-09-19

    Application No.: US13421626

    Filing date: 2012-03-15

    IPC classification: G06Q30/02

    CPC classification: G06Q30/0256

    Abstract: According to a cost-per-action advertising model, advertisers submit ads with cost-per-action bids. Ad auctions are conducted and winning ads are returned with contextually relevant search results. Each time a winning ad is selected by a user, resulting in the user being redirected to a website associated with the advertiser, a selected impression and a price is recorded for the winning ad. Periodically, an advertiser submits a report indicating a number of actions attributed to the ads that have occurred through the advertiser website. The advertiser is then charged a fee for each reported action based on the recorded prices for the winning ads and based on the number of selected impressions recorded for the winning ads.


    Calculating web page importance based on a conditional markov random walk
    38.
    Type: Granted patent
    Legal status: Active

    Publication No.: US08145592B2

    Publication date: 2012-03-27

    Application No.: US12370573

    Filing date: 2009-02-12

    IPC classification: G06F15/00, G06F15/18

    CPC classification: G06F17/30864, G06F17/30882

    Abstract: An importance system calculates the importance of pages using a conditional Markov random walk model rather than a conventional Markov random walk model. The importance system calculates the importance of pages factoring in the importance of sites that contain those pages. The importance system may factor in the importance of sites based on the strength of the correlation of the importance of a page to the importance of a site. The strength of the correlation may be based upon the depth of the page within the site. The importance system may iteratively calculate the importance of the pages using “conditional” transition probabilities. During each iteration, the importance system may recalculate the conditional transition probabilities based on the importance of sites that are derived from the recalculated importance of pages during the iteration.


    Supervised rank aggregation based on rankings
    39.
    Type: Granted patent
    Legal status: Active

    Publication No.: US08005784B2

    Publication date: 2011-08-23

    Application No.: US12906010

    Filing date: 2010-10-15

    IPC classification: G06F15/00, G06F15/18

    Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.

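To make the idea of learning one weight per ranking source concrete, here is a deliberately simplified sketch. It swaps the patent's optimization framework for a perceptron-style update over Borda scores, which is an assumption made purely for illustration; the function names, `epochs`, and `lr` are likewise hypothetical.

```python
def borda_scores(ranking):
    """Map each entity to a score: first place gets len(ranking), last gets 1."""
    n = len(ranking)
    return {entity: n - pos for pos, entity in enumerate(ranking)}

def learn_weights(rankings, labeled_pairs, epochs=50, lr=0.1):
    """Learn a non-negative weight per ranking source from labeled pairs.

    labeled_pairs: list of (better, worse) entity pairs from training data.
    """
    scores = [borda_scores(r) for r in rankings]
    weights = [1.0 / len(rankings)] * len(rankings)
    for _ in range(epochs):
        for better, worse in labeled_pairs:
            # How strongly each source agrees with the labeled order.
            margins = [s[better] - s[worse] for s in scores]
            if sum(w * m for w, m in zip(weights, margins)) <= 0:
                # Aggregated order disagrees with the label: shift weight
                # toward sources that agree, away from those that do not.
                weights = [max(0.0, w + lr * m)
                           for w, m in zip(weights, margins)]
    return weights

def aggregate(rankings, weights):
    """Order entities by their weighted sum of Borda scores."""
    scores = [borda_scores(r) for r in rankings]
    total = {e: sum(w * s[e] for w, s in zip(weights, scores))
             for e in rankings[0]}
    return sorted(rankings[0], key=lambda e: -total[e])
```

The sketch assumes every source ranks the same set of entities; handling partial rankings would need a default score for missing entities.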

    Spectral clustering using sequential matrix compression
    40.
    Type: Granted patent
    Legal status: Lapsed

    Publication No.: US07974977B2

    Publication date: 2011-07-05

    Application No.: US11743942

    Filing date: 2007-05-03

    IPC classification: G06F7/00, G06F17/30

    CPC classification: G06K9/6224, G06F17/3071

    Abstract: A clustering system generates an original Laplacian matrix representing objects and their relationships. The clustering system initially applies an eigenvalue decomposition solver to the original Laplacian matrix for a number of iterations. The clustering system then identifies the elements of the resultant eigenvector that are stable. The clustering system then aggregates the elements of the original Laplacian matrix corresponding to the identified stable elements and forms a new Laplacian matrix that is a compressed form of the original Laplacian matrix. The clustering system repeats the applying of the eigenvalue decomposition solver and the generating of new compressed Laplacian matrices until the new Laplacian matrix is small enough so that a final solution can be generated in a reasonable amount of time.
