CLICK MODEL FOR SEARCH RANKINGS
    11.
    Patent application
    Status: in force

    Publication number: US20100125570A1

    Publication date: 2010-05-20

    Application number: US12273425

    Application date: 2008-11-18

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Approaches and techniques are discussed for ranking the documents indicated in search results for a query based on click-through information collected for the query in previous query sessions. According to an embodiment of the invention, when calculating a relevance score for a particular document, one may overcome positional bias by utilizing click-through information about other documents previously returned in the same search results as the particular document. According to an embodiment, one may utilize a Dynamic Bayesian Network, based on said click-through information, to model relevance. According to an embodiment of the invention, one may utilize click-through information to generate targets for learning a ranking function.
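
As a rough illustration of how clicks on other documents in the same results can offset positional bias, the sketch below uses a cascade-style simplification rather than the Dynamic Bayesian Network named in the abstract. It assumes a user scans results top-down and examines every position down to the last click. The function name `estimate_attractiveness` and the session format are hypothetical.

```python
from collections import defaultdict


def estimate_attractiveness(sessions):
    """Estimate per-document attractiveness as clicks / examinations.

    Each session is a (ranked_doc_ids, clicked_doc_ids) pair. Assumption
    (a cascade-style simplification, not the patent's model): the user
    examined every position down to the last clicked one, and a session
    without clicks carries no examination evidence.
    """
    clicks = defaultdict(int)
    examinations = defaultdict(int)
    for ranked, clicked in sessions:
        clicked = set(clicked)
        clicked_positions = [i for i, doc in enumerate(ranked) if doc in clicked]
        if not clicked_positions:
            continue  # cannot tell which positions were examined
        last = clicked_positions[-1]
        for doc in ranked[: last + 1]:
            examinations[doc] += 1
            if doc in clicked:
                clicks[doc] += 1
    return {doc: clicks[doc] / examinations[doc] for doc in examinations}


sessions = [
    (["a", "b", "c"], ["b"]),  # "a" was skipped; "c" is treated as unseen
    (["a", "b", "c"], ["a"]),
    (["b", "a", "c"], ["b"]),
]
scores = estimate_attractiveness(sessions)
```

Under this assumption a document shown below the last click is never penalised for a missing click, which is the sense in which the other documents' clicks correct for position.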

    Hierarchical Recognition Through Semantic Embedding
    12.
    Patent application
    Status: under examination (published)

    Publication number: US20090271339A1

    Publication date: 2009-10-29

    Application number: US12111500

    Application date: 2008-04-29

    IPC classification: G06F15/18

    CPC classification: G06N20/00

    Abstract: Computer-implemented systems and methods, including servers, perform structure-based recognition processes that include matching and classification. Preprocessing subsystems and sub-methods embed a set of classes on which a loss function is defined into a semantic space and learn an input mapping between an input space and the semantic space. Recognition subsystems and methods accept a test object, representable in the input space, and apply the input mapping to the test object as part of a recognition process.

    Data Mining Unlearnable Data Sets
    13.
    Patent application
    Status: under examination (published)

    Publication number: US20080027886A1

    Publication date: 2008-01-31

    Application number: US11572193

    Application date: 2005-07-18

    IPC classification: G06G7/00

    Abstract: This invention concerns data mining, that is, the extraction of information, from “unlearnable” data sets. In particular, it concerns an apparatus and a method for this purpose. The invention involves creating a finite training sample from the data set (14), then training (50) a learning device (32) using a supervised learning algorithm to predict labels for each item of the training sample. Other data from the data set are then processed with the trained learning device to predict labels, and it is determined whether the predicted labels are better (learnable) or worse (anti-learnable) than random guessing (52). If they are worse (anti-learnable), a reverser (34) is used to apply negative weighting to the predicted labels (54).
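
The reverser step can be sketched in a few lines. This is a minimal illustration, assuming binary labels in {+1, -1}, balanced classes (so random guessing scores 0.5), and a learner that outputs signed scores. The helper names `accuracy` and `make_reverser` and the sample numbers are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions whose sign matches the +1/-1 label."""
    hits = sum(1 for p, y in zip(predictions, labels) if (p >= 0) == (y > 0))
    return hits / len(labels)


def make_reverser(holdout_scores, holdout_labels, chance=0.5):
    """Return -1.0 if the learner does worse than chance, else +1.0.

    Assumption: balanced binary classes, so chance accuracy is 0.5.
    """
    return -1.0 if accuracy(holdout_scores, holdout_labels) < chance else 1.0


# Hypothetical held-out scores from a trained learner that is
# systematically wrong on data outside its training sample.
holdout_scores = [0.9, 0.4, -0.7, -0.2, 0.6]
holdout_labels = [-1, -1, 1, 1, 1]
weight = make_reverser(holdout_scores, holdout_labels)

# Apply the (possibly negative) weight to predictions on further data.
new_scores = [0.8, -0.5]
corrected = [weight * s for s in new_scores]
```

Here the held-out accuracy is below chance, so the weight is negative and the sign of every later prediction is flipped.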

    System and method for training a multi-class support vector machine to select a common subset of features for classifying objects
    14.
    Granted patent
    Status: in force

    Publication number: US07836000B2

    Publication date: 2010-11-16

    Application number: US12001932

    Application date: 2007-12-10

    CPC classification: G06K9/6249 G06K9/6269

    Abstract: An improved system and method are provided for training a multi-class support vector machine to select a common subset of features for classifying objects. A multi-class support vector machine generator may be provided for learning classification functions to classify sets of objects into classes and may include a sparse support vector machine modeling engine for training a multi-class support vector machine using scaling factors by simultaneously selecting a common subset of features iteratively for all classes from sets of features representing each of the classes. An objective function using scaling factors to ensure sparsity of features may be iteratively minimized, and features may be retained and added until a small set of features stabilizes. Alternatively, a common subset of features may be found by iteratively removing at least one feature simultaneously for all classes from an active set of features initialized to represent the entire set of training features.

    GRADIENT BASED OPTIMIZATION OF A RANKING MEASURE
    15.
    Patent application
    Status: in force

    Publication number: US20090089274A1

    Publication date: 2009-04-02

    Application number: US11863453

    Application date: 2007-09-28

    Applicant: Olivier Chapelle

    Inventor: Olivier Chapelle

    IPC classification: G06F7/00 G06F17/15

    CPC classification: G06F17/30675 G06F17/30864

    Abstract: Methods, systems, and apparatuses for generating relevance functions for ranking documents obtained in searches are provided. One or more features to be used as predictor variables in the construction of a relevance function are determined. The relevance function is parameterized by one or more coefficients. A query error is defined that measures a difference between a relevance ranking generated by the relevance function and a training set relevance ranking based on a query and a set of scored documents associated with the query. The query error is a continuous function of the coefficients and aims at approximating error measures commonly used in Information Retrieval. Values for the coefficients of the relevance function are determined that substantially minimize an objective function that depends on the defined query error.
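
To make the idea of a continuous query error concrete, the sketch below smooths DCG, one common Information Retrieval measure, by replacing the hard assignment of documents to positions with Gaussian soft weights. This particular smoothing, the width parameter `sigma`, and the function names are assumptions chosen for illustration; the abstract does not specify which approximation is used.

```python
import math


def dcg(scores, gains):
    """Exact DCG: sort by score, discount 0-based position r by 1/log2(r + 2)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return sum(gains[i] / math.log2(r + 2) for r, i in enumerate(order))


def smooth_dcg(scores, gains, sigma):
    """Soft DCG: every document contributes to every position with a weight."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total = 0.0
    for r, top in enumerate(order):
        # Soft indicator that a document sits at position r: large when its
        # score is close to the score of the document actually ranked there.
        w = [math.exp(-((s - scores[top]) ** 2) / (2 * sigma ** 2)) for s in scores]
        z = sum(w)
        expected_gain = sum(g * wi / z for g, wi in zip(gains, w))
        total += expected_gain / math.log2(r + 2)
    return total


scores = [2.0, 0.0, 1.0]  # outputs of a hypothetical relevance function
gains = [3.0, 0.0, 1.0]   # graded relevance labels from a training set
exact = dcg(scores, gains)
sharp = smooth_dcg(scores, gains, sigma=0.1)
blurred = smooth_dcg(scores, gains, sigma=100.0)
```

With a small `sigma` the smooth value tracks the exact DCG closely; with a large `sigma` it flattens toward an average, which is easier to optimize by gradient methods but less faithful to the target measure.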

    GLOBAL AND TOPICAL RANKING OF SEARCH RESULTS USING USER CLICKS
    16.
    Patent application
    Status: under examination (published)

    Publication number: US20110029517A1

    Publication date: 2011-02-03

    Application number: US12533564

    Application date: 2009-07-31

    IPC classification: G06F17/30

    CPC classification: G06F16/951

    Abstract: To estimate, or predict, the relevance of items, or documents, in a set of search results, relevance information is extracted from user click data, and relational information among the documents as manifested by an aggregation of user clicks is determined from the click data. A supervised approach uses judgment information, such as human judgment information, as part of the training data used to generate a relevance predictor model, which minimizes the inherent noisiness of the click data collected from a commercial search engine.

    EFFICIENT ALGORITHM FOR PAIRWISE PREFERENCE LEARNING
    17.
    Patent application
    Status: in force

    Publication number: US20110016065A1

    Publication date: 2011-01-20

    Application number: US12504460

    Application date: 2009-07-16

    IPC classification: G06F15/18

    CPC classification: G06N99/005

    Abstract: In one embodiment, training a ranking model comprises: accessing the ranking model and an objective function of the ranking model; accessing one or more preference pairs of objects, wherein for each of the preference pairs of objects comprising a first object and a second object, there is a preference between the first object and the second object with respect to a particular reference, and the first object and the second object each have a feature vector comprising one or more feature values; and training the ranking model by minimizing the objective function using the preference pairs of objects, wherein for each of the preference pairs of objects, a difference between the first feature vector of the first object and the second feature vector of the second object is not calculated.
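
One way to avoid computing per-pair difference vectors is to accumulate a scalar coefficient per object and touch each feature vector only once. The sketch below shows this for a linear model with a squared hinge loss over pairs. Both the loss and the linear model are assumptions made for illustration, and the function names are hypothetical.

```python
def pairwise_gradient(w, features, pairs):
    """Gradient of sum over pairs of max(0, 1 - (s_i - s_j))**2 with s = X w.

    Each pair (i, j) means object i is preferred over object j. No x_i - x_j
    vector is formed: pairs only update per-object scalar coefficients.
    """
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in features]
    coeff = [0.0] * len(features)
    for i, j in pairs:
        margin = 1.0 - (scores[i] - scores[j])
        if margin > 0:
            coeff[i] -= 2.0 * margin
            coeff[j] += 2.0 * margin
    grad = [0.0] * len(w)
    for c, x in zip(coeff, features):
        for k, xk in enumerate(x):
            grad[k] += c * xk
    return grad


def naive_gradient(w, features, pairs):
    """Reference version that builds every difference vector explicitly."""
    grad = [0.0] * len(w)
    for i, j in pairs:
        diff = [a - b for a, b in zip(features[i], features[j])]
        margin = 1.0 - sum(wk * dk for wk, dk in zip(w, diff))
        if margin > 0:
            for k, dk in enumerate(diff):
                grad[k] -= 2.0 * margin * dk
    return grad


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [0.5, -0.5]
pairs = [(0, 1), (2, 1), (0, 2)]
fast = pairwise_gradient(w, features, pairs)
slow = naive_gradient(w, features, pairs)
```

The first version does work proportional to the number of pairs plus the number of objects times the feature dimension, whereas the naive version pays the feature dimension for every pair.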

    OPTIMIZATION OF RANKING MEASURES AS A STRUCTURED OUTPUT PROBLEM
    18.
    Patent application
    Status: in force

    Publication number: US20090138463A1

    Publication date: 2009-05-28

    Application number: US11946552

    Application date: 2007-11-28

    Applicant: Olivier Chapelle

    Inventor: Olivier Chapelle

    IPC classification: G06F17/30

    CPC classification: G06F17/30867

    Abstract: Methods, systems, and apparatuses for generating relevance functions for ranking documents obtained in searches are provided. One or more features to be used as predictor variables in the construction of a relevance function are determined. The relevance function is parameterized by one or more coefficients. An ideal query error is defined that measures, for a given query, a difference between a ranking generated by the relevance function and a ranking based on a training set. According to a structured output learning framework, values for the coefficients of the relevance function are determined to substantially minimize an objective function that depends on a continuous upper bound of the defined ideal query error.

    Kernels and kernel methods for spectral data
    19.
    Patent application
    Status: in force

    Publication number: US20050228591A1

    Publication date: 2005-10-13

    Application number: US10267977

    Application date: 2002-10-09

    Abstract: Support vector machines are used to classify data contained within a structured dataset such as a plurality of signals generated by a spectral analyzer. The signals are preprocessed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.

    Click model for search rankings
    20.
    Granted patent
    Status: in force

    Publication number: US08671093B2

    Publication date: 2014-03-11

    Application number: US12273425

    Application date: 2008-11-18

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Approaches and techniques are discussed for ranking the documents indicated in search results for a query based on click-through information collected for the query in previous query sessions. According to an embodiment of the invention, when calculating a relevance score for a particular document, one may overcome positional bias by utilizing click-through information about other documents previously returned in the same search results as the particular document. According to an embodiment, one may utilize a Dynamic Bayesian Network, based on said click-through information, to model relevance. According to an embodiment of the invention, one may utilize click-through information to generate targets for learning a ranking function.
