Listwise Ranking
    81.
    Patent application
    Listwise Ranking (legal status: lapsed)

    Publication No.: US20090106222A1

    Publication date: 2009-04-23

    Application No.: US11874813

    Filing date: 2007-10-18

    IPC classification: G06F7/00

    CPC classification: G06F17/30864

    Abstract: Procedures for learning and ranking items in a listwise manner are discussed. A listwise methodology may consider a ranked list of individual items as a specific permutation of the items being ranked. In implementations, a listwise loss function may be used in ranking items. A listwise loss function may be a metric which reflects the departure or disorder from an exemplary ranking for one or more sample listwise rankings used in learning. In this manner, the loss function may approximate the exemplary ranking for the plurality of items being ranked.
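
    To make the idea of a listwise loss concrete, the sketch below compares a predicted score list with an exemplary one by taking the cross entropy between their softmax ("top-one") distributions. This particular loss is an assumption chosen for illustration; the abstract does not say which loss the application uses, and the names `softmax` and `listwise_loss` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(predicted_scores, exemplary_scores):
    """Cross entropy between the exemplary and predicted top-one distributions.

    The whole list is scored at once, so the value grows as the predicted
    ordering departs from the exemplary ordering. This is an illustrative
    choice, not the loss claimed in the application.
    """
    target = softmax(exemplary_scores)
    predicted = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


exemplary = [3.0, 2.0, 1.0]
agreeing = listwise_loss([3.0, 2.0, 1.0], exemplary)
reversed_order = listwise_loss([1.0, 2.0, 3.0], exemplary)
```

    Under these assumptions, a prediction that reverses the exemplary order yields a larger loss than one that reproduces it.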

    LEARNING A DOCUMENT RANKING USING A LOSS FUNCTION WITH A RANK PAIR OR A QUERY PARAMETER
    82.
    Patent application
    LEARNING A DOCUMENT RANKING USING A LOSS FUNCTION WITH A RANK PAIR OR A QUERY PARAMETER (legal status: in force)

    Publication No.: US20080027925A1

    Publication date: 2008-01-31

    Application No.: US11460838

    Filing date: 2006-07-28

    IPC classification: G06F17/30

    Abstract: A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.
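
    The two ideas in this abstract, a rank-pair weight and a per-query normalization, can be sketched as follows. The hinge loss, the weight rule in `relevance_weight`, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
def relevance_weight(higher_label, lower_label):
    """Hypothetical rank-pair weight: pairs led by a more relevant document count more."""
    return float(higher_label)


def query_loss(scores, labels, pair_weight=relevance_weight):
    """Weighted pairwise hinge loss for one query, averaged over its pairs.

    Dividing by the number of pairs keeps a query with many labeled
    documents from dominating the total loss.
    """
    total, n_pairs = 0.0, 0
    for i, label_i in enumerate(labels):
        for j, label_j in enumerate(labels):
            if label_i > label_j:  # document i should rank above document j
                margin = scores[i] - scores[j]
                total += pair_weight(label_i, label_j) * max(0.0, 1.0 - margin)
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


def ranking_loss(queries, pair_weight=relevance_weight):
    """Mean of the normalized per-query losses; `queries` is a list of (scores, labels)."""
    return sum(query_loss(s, l, pair_weight) for s, l in queries) / len(queries)


wrong_order = ([0.0, 1.0, 2.0], [2, 1, 0])   # scores invert the labels
right_order = ([2.0, 1.0, 0.0], [2, 1, 0])   # scores agree with the labels
overall = ranking_loss([wrong_order, right_order])
```

    In this sketch each query contributes equally to `overall`, whatever its number of document pairs.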

    User Information Needs Based Data Selection
    83.
    Patent application
    User Information Needs Based Data Selection (legal status: in force)

    Publication No.: US20120259831A1

    Publication date: 2012-10-11

    Application No.: US13080510

    Filing date: 2011-04-05

    IPC classification: G06F7/00, G06F17/30

    Abstract: Techniques for determining user information needs and selecting data based on user information needs are described herein. The present disclosure describes extracting topics of interest to users from multiple sources, including search log data and social network websites, and assigning a budget to each topic to stipulate the quota of data to be selected for each topic. The present disclosure also describes calculating similarities between gathered data and the topics, and selecting the top related data for each topic subject to the limit of the budget. A search engine may use the techniques described here to select data for its index.
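
    The selection step (score gathered data against each topic, then keep the best matches up to that topic's budget) might look like the sketch below. Bag-of-words cosine similarity and the names `cosine` and `select_by_budget` are assumptions for illustration; the abstract does not name a similarity measure.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_by_budget(documents, topics, budgets):
    """For each topic, keep the most similar documents, at most budgets[topic] of them.

    documents: doc id -> text; topics: topic name -> keyword string;
    budgets: topic name -> maximum number of documents to select.
    """
    doc_vectors = {doc_id: Counter(text.lower().split())
                   for doc_id, text in documents.items()}
    selected = {}
    for topic, keywords in topics.items():
        topic_vector = Counter(keywords.lower().split())
        scored = [(cosine(vec, topic_vector), doc_id)
                  for doc_id, vec in doc_vectors.items()]
        # Drop unrelated documents, then take the top matches within the budget.
        related = sorted((s, d) for s, d in scored if s > 0.0)
        related.reverse()
        selected[topic] = [doc_id for _, doc_id in related[:budgets[topic]]]
    return selected
```

    A topic whose budget exceeds the number of related documents simply receives fewer items than its quota.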

    Graph-processing techniques for a MapReduce engine
    84.
    Granted patent
    Graph-processing techniques for a MapReduce engine (legal status: in force)

    Publication No.: US08224825B2

    Publication date: 2012-07-17

    Application No.: US12790942

    Filing date: 2010-05-31

    IPC classification: G06F17/30

    CPC classification: G06F17/30584

    Abstract: Systems, methods, and devices for sorting and processing various types of graph data are described herein. Partitioning graph data into master data and associated slave data allows for sorting of the graph data by sorting the master data. In another embodiment, promoting a data bucket having a first data bucket size to a data bucket having a second data bucket size greater than the first data bucket size upon reaching a memory limit allows for the reduction of temporary files output by the data bucket.

    Supervised rank aggregation based on rankings
    85.
    Granted patent
    Supervised rank aggregation based on rankings (legal status: in force)

    Publication No.: US08005784B2

    Publication date: 2011-08-23

    Application No.: US12906010

    Filing date: 2010-10-15

    IPC classification: G06F15/00, G06F15/18

    Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.
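
    As a rough illustration of learning one weight per ranking source from labeled pairs, the sketch below combines sources with weighted Borda counts and adjusts the weights with a perceptron-style update. Both choices are stand-ins assumed for this example; the abstract refers only to "an optimization framework" and does not describe this procedure. All names are hypothetical, and every source is assumed to rank the same set of entities.

```python
def borda_scores(ranking):
    """Order-based score: the top entity gets len(ranking), the last gets 1."""
    n = len(ranking)
    return {entity: n - position for position, entity in enumerate(ranking)}


def aggregate(rankings, weights):
    """Combine the sources' Borda scores with the given weights and sort."""
    combined = {}
    for ranking, weight in zip(rankings, weights):
        for entity, score in borda_scores(ranking).items():
            combined[entity] = combined.get(entity, 0.0) + weight * score
    return sorted(combined, key=combined.get, reverse=True)


def learn_weights(rankings, preferred_pairs, epochs=20, learning_rate=0.1):
    """Raise the weight of sources that agree with the labeled (better, worse) pairs."""
    per_source = [borda_scores(r) for r in rankings]
    weights = [1.0 / len(rankings)] * len(rankings)
    for _ in range(epochs):
        for better, worse in preferred_pairs:
            diffs = [scores[better] - scores[worse] for scores in per_source]
            if sum(w * d for w, d in zip(weights, diffs)) <= 0:
                # The aggregate gets this pair wrong (or ties): shift weight
                # toward the sources that order it correctly.
                weights = [max(0.0, w + learning_rate * d)
                           for w, d in zip(weights, diffs)]
    return weights
```

    With two sources that rank three entities in opposite orders, training pairs that agree with the first source push its weight above the second's.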

    Active spam testing system
    86.
    Granted patent
    Active spam testing system (legal status: in force)

    Publication No.: US07680851B2

    Publication date: 2010-03-16

    Application No.: US11682971

    Filing date: 2007-03-07

    Applicant: Tie-Yan Liu, Hang Li

    Inventor: Tie-Yan Liu, Hang Li

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: A method and system for introducing spam into a search engine for testing purposes is provided. An active spam testing system receives from a tester a specification of spam that is to be introduced into the search engine for testing purposes. The testing system may then generate auxiliary data structures for storing indications of the spam that is to be introduced. A search engine has original data structures that may include a content index and a link data structure. The testing system stores the indications of the spam in the auxiliary data structures so that use of the search engine for non-testing purposes is not affected. When the search engine is used for testing purposes, the search engine generates search results based on a combination of the original data structures and the auxiliary data structures.
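
    The separation between original and auxiliary data structures can be pictured with a toy inverted index. The class and method names below are hypothetical, and the link data structure mentioned in the abstract is left out to keep the sketch short.

```python
class SearchEngine:
    """Toy engine with an original content index and a test-only auxiliary index."""

    def __init__(self, content_index):
        # Original data structure: term -> set of document ids.
        self.content_index = content_index
        # Auxiliary data structure: holds only the introduced spam.
        self.auxiliary_index = {}

    def introduce_spam(self, doc_id, terms):
        """Record a spam document in the auxiliary index; the original is untouched."""
        for term in terms:
            self.auxiliary_index.setdefault(term, set()).add(doc_id)

    def search(self, term, testing=False):
        """Non-testing searches read the original index only; testing searches read both."""
        results = set(self.content_index.get(term, ()))
        if testing:
            results |= self.auxiliary_index.get(term, set())
        return sorted(results)


engine = SearchEngine({"cheap": {"doc1"}})
engine.introduce_spam("spam1", ["cheap", "pills"])
normal_results = engine.search("cheap")
test_results = engine.search("cheap", testing=True)
```

    Because the spam lives only in the auxiliary index, ordinary queries in this sketch never see it.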

    Method and system for generating a classifier using inter-sample relationships
    87.
    Granted patent
    Method and system for generating a classifier using inter-sample relationships (legal status: in force)

    Publication No.: US07519217B2

    Publication date: 2009-04-14

    Application No.: US10997073

    Filing date: 2004-11-23

    IPC classification: G06K9/62, G06K9/68

    CPC classification: G06K9/00711, G06K9/6292

    Abstract: A method and system for generating a classifier to classify sub-objects of an object based on a relationship between sub-objects is provided. The classification system provides training sub-objects along with the actual classification of each training sub-object. The classification system may iteratively train sub-classifiers based on feature vectors representing the features of each sub-object, the actual classification of the sub-object, and a weight associated with the sub-object. After a sub-classifier is trained, the classification system classifies the training sub-objects using the trained sub-classifier. The classification system then adjusts the classifications based on relationships between training sub-objects. The classification system assigns a weight for the sub-classifier and a weight for each sub-object based on the accuracy of the adjusted classifications.

    Method and system for detecting black frames in a sequence of frames
    88.
    Patent application
    Method and system for detecting black frames in a sequence of frames (legal status: in force)

    Publication No.: US20060110057A1

    Publication date: 2006-05-25

    Application No.: US10997071

    Filing date: 2004-11-23

    IPC classification: G06K9/36

    CPC classification: H04N17/004

    Abstract: Methods and systems for identifying black frames within a sequence of frames are provided. In one embodiment, the detection system detects black frames within a sequence of frames by fully decoding base frames and then partially decoding non-black, non-base frames in a way that ensures the blackness of each frame can be determined. The detection system decodes base frames before decoding dependent frames, which is referred to as processing frames in reverse order of dependency since a frame is processed before the frames that depend on it are processed. In another embodiment, the detection system determines the blackness of frames within a sequence of frames by processing the frames in order of their dependency and following chains of block dependency to decode and determine the blackness of blocks.

    Noise Tolerant Graphical Ranking Model
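    89.
    Patent application
    Noise Tolerant Graphical Ranking Model (legal status: under examination, published)

    Publication No.: US20120271821A1

    Publication date: 2012-10-25

    Application No.: US13090848

    Filing date: 2011-04-20

    IPC classification: G06F17/30

    CPC classification: G06F16/3346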

    Abstract: The relevance of an object, such as a document resulting from a query, may be determined automatically. A graphical model-based technique is applied to determine the relevance of the object. The graphical model may represent relationships between actual and observed labels for the object, based on features of the object. The graphical model may take into account an assumption of noisy training data by modeling the noise.

    Multi-ranker for search
    90.
    Granted patent
    Multi-ranker for search (legal status: in force)

    Publication No.: US08122015B2

    Publication date: 2012-02-21

    Application No.: US11859066

    Filing date: 2007-09-21

    IPC classification: G06F7/00, G06F17/30

    CPC classification: G06F17/3053

    Abstract: Systems and methods for processing user queries and identifying a set of documents relevant to the user query from a database using multi-ranker search are described. In one implementation, the retrieved documents can be paired to form document pairs, or instance pairs, in a variety of combinations. Such instance pairs may have a rank order between them as they all have different ranks. A classifier, hyperplane, and a base ranker may be constructed for identifying the rank order relationships between the two instances in an instance pair. The base ranker may be generated for each rank pair. The systems use a divide and conquer strategy for learning to rank the instance pairs by employing multiple hyperplanes and aggregate the base rankers to form an ensemble of base rankers. Such an ensemble of base rankers can be used to rank the documents or instances.