METHOD AND SYSTEM FOR DISTRIBUTED MACHINE LEARNING
    21.
    Type: Invention application
    Legal status: In force

    Publication No.: US20130290223A1

    Publication date: 2013-10-31

    Application No.: US13458545

    Filing date: 2012-04-27

    IPC classification: G06F15/18

    CPC classification: G06N99/005 G06F15/18

    Abstract: Method, system, and programs for distributed machine learning on a cluster including a plurality of nodes are disclosed. A machine learning process is performed in each of the plurality of nodes based on a respective subset of training data to calculate a local parameter. The training data is partitioned over the plurality of nodes. A plurality of operation nodes are determined from the plurality of nodes based on a status of the machine learning process performed in each of the plurality of nodes. The plurality of operation nodes are connected to form a network topology. An aggregated parameter is generated by merging local parameters calculated in each of the plurality of operation nodes in accordance with the network topology.
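The flow described in this abstract (select operation nodes by status, connect them into a topology, merge local parameters along it) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the boolean status flag, the binary-tree topology, averaging as the merge rule, and all function names are assumptions introduced here.

```python
from typing import Dict, List


def select_operation_nodes(status: Dict[str, bool]) -> List[str]:
    """Keep only nodes whose local learning process reported success (assumed flag)."""
    return sorted(node for node, ok in status.items() if ok)


def build_tree(nodes: List[str]) -> Dict[str, List[str]]:
    """Assumed topology: a binary tree where node i has children 2i+1 and 2i+2."""
    return {
        nodes[i]: [nodes[j] for j in (2 * i + 1, 2 * i + 2) if j < len(nodes)]
        for i in range(len(nodes))
    }


def reduce_subtree(
    node: str,
    children: Dict[str, List[str]],
    local_params: Dict[str, List[float]],
) -> List[float]:
    """Sum the local parameter vectors of a node and all of its descendants."""
    total = list(local_params[node])
    for child in children[node]:
        partial = reduce_subtree(child, children, local_params)
        total = [a + b for a, b in zip(total, partial)]
    return total


def merge_parameters(
    status: Dict[str, bool], local_params: Dict[str, List[float]]
) -> List[float]:
    """Average the local parameters of the operation nodes along the tree."""
    nodes = select_operation_nodes(status)
    if not nodes:
        raise ValueError("no operation nodes available")
    children = build_tree(nodes)
    summed = reduce_subtree(nodes[0], children, local_params)
    return [value / len(nodes) for value in summed]
```

A node that failed its local pass is simply left out of the tree, so its parameters never reach the aggregate.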

    Determining a relevance function based on a query error derived using a structured output learning technique
    22.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US08005774B2

    Publication date: 2011-08-23

    Application No.: US11946552

    Filing date: 2007-11-28

    Applicant: Olivier Chapelle

    Inventor: Olivier Chapelle

    IPC classification: G06F15/18 G06F17/10 G06F17/30

    CPC classification: G06F17/30867

    Abstract: Methods, systems, and apparatuses for generating relevance functions for ranking documents obtained in searches are provided. One or more features to be used as predictor variables in the construction of a relevance function are determined. The relevance function is parameterized by one or more coefficients. An ideal query error is defined that measures, for a given query, a difference between a ranking generated by the relevance function and a ranking based on a training set. According to a structured output learning framework, values for the coefficients of the relevance function are determined to substantially minimize an objective function that depends on a continuous upper bound of the defined ideal query error. The query error is determined using a structured output learning technique. The query error is defined as a maximum over a set of permutations.
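One way to picture "a continuous upper bound defined as a maximum over a set of permutations" is the brute-force sketch below. It is an illustration only and may differ from the patent's actual formulation: the NDCG-based loss, the logarithmic position weights, the linear alignment score, and every function name are assumptions chosen here. Enumerating all permutations is only practical for very short document lists.

```python
import math
from itertools import permutations
from typing import List, Sequence


def position_weights(n: int) -> List[float]:
    """Decreasing weights for rank positions 0..n-1 (assumed DCG-style discount)."""
    return [1.0 / math.log2(k + 2) for k in range(n)]


def dcg(perm: Sequence[int], labels: Sequence[int]) -> float:
    weights = position_weights(len(perm))
    return sum((2 ** labels[d] - 1) * weights[k] for k, d in enumerate(perm))


def ndcg_loss(perm: Sequence[int], labels: Sequence[int]) -> float:
    """1 - NDCG of a ranking; 0 means the ranking matches the training labels."""
    ideal = sorted(range(len(labels)), key=lambda d: -labels[d])
    best = dcg(ideal, labels)
    return 0.0 if best == 0 else 1.0 - dcg(perm, labels) / best


def alignment(perm: Sequence[int], scores: Sequence[float]) -> float:
    """How well a ranking agrees with the relevance scores (largest when sorted by score)."""
    weights = position_weights(len(perm))
    return sum(scores[d] * weights[k] for k, d in enumerate(perm))


def ideal_query_error(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Piecewise-constant error of the ranking induced by the scores."""
    ranked = sorted(range(len(scores)), key=lambda d: -scores[d])
    return ndcg_loss(ranked, labels)


def upper_bound(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Max over permutations of loss plus score advantage over the label-ideal ranking."""
    ideal = sorted(range(len(labels)), key=lambda d: -labels[d])
    base = alignment(ideal, scores)
    return max(
        ndcg_loss(p, labels) + alignment(p, scores) - base
        for p in permutations(range(len(labels)))
    )
```

Because the bound is a maximum of functions that are linear in the scores, it is continuous in them, whereas `ideal_query_error` jumps whenever two scores swap order.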

    Gradient based optimization of a ranking measure
    23.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US07895198B2

    Publication date: 2011-02-22

    Application No.: US11863453

    Filing date: 2007-09-28

    Applicant: Olivier Chapelle

    Inventor: Olivier Chapelle

    IPC classification: G06F17/30 G06F7/00

    CPC classification: G06F17/30675 G06F17/30864

    Abstract: Methods, systems, and apparatuses for generating relevance functions for ranking documents obtained in searches are provided. One or more features to be used as predictor variables in the construction of a relevance function are determined. The relevance function is parameterized by one or more coefficients. A query error is defined that measures a difference between a relevance ranking generated by the relevance function and a training set relevance ranking based on a query and a set of scored documents associated with the query. The query error is a continuous function of the coefficients and aims at approximating error measures commonly used in Information Retrieval. Values for the coefficients of the relevance function are determined that substantially minimize an objective function that depends on the defined query error.
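To make the idea of a query error that is continuous in the coefficients concrete, the sketch below fits a linear relevance function by plain gradient descent. It swaps in a pairwise logistic surrogate as the smooth error, which is an assumption for illustration and not necessarily the measure the patent defines; the function names, learning rate, and step count are likewise hypothetical.

```python
import math
from typing import List, Sequence


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear relevance function parameterized by the coefficients w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def smooth_query_error(
    w: Sequence[float], docs: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Sum of logistic penalties over document pairs that should be ordered i above j."""
    total = 0.0
    for i in range(len(docs)):
        for j in range(len(docs)):
            if labels[i] > labels[j]:
                margin = score(w, docs[i]) - score(w, docs[j])
                total += math.log1p(math.exp(-margin))
    return total


def gradient(
    w: Sequence[float], docs: Sequence[Sequence[float]], labels: Sequence[int]
) -> List[float]:
    """Analytic gradient of smooth_query_error with respect to w."""
    grad = [0.0] * len(w)
    for i in range(len(docs)):
        for j in range(len(docs)):
            if labels[i] > labels[j]:
                margin = score(w, docs[i]) - score(w, docs[j])
                coeff = -1.0 / (1.0 + math.exp(margin))
                for k in range(len(w)):
                    grad[k] += coeff * (docs[i][k] - docs[j][k])
    return grad


def fit(
    docs: Sequence[Sequence[float]],
    labels: Sequence[int],
    steps: int = 200,
    lr: float = 0.1,
) -> List[float]:
    """Gradient descent on the smooth error, starting from zero coefficients."""
    w = [0.0] * len(docs[0])
    for _ in range(steps):
        g = gradient(w, docs, labels)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w
```

Since the surrogate is differentiable everywhere, any standard gradient-based optimizer could replace the fixed-step loop in `fit`.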

    SUPPORT VECTOR MACHINE-BASED METHOD FOR ANALYSIS OF SPECTRAL DATA
    24.
    Type: Invention application
    Legal status: Lapsed

    Publication No.: US20100205124A1

    Publication date: 2010-08-12

    Application No.: US12700575

    Filing date: 2010-02-04

    IPC classification: G06F15/18

    Abstract: Support vector machines are used to classify data contained within a structured dataset such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.
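The pre-processing and similarity steps in this abstract might look something like the sketch below: one spectrum is shifted to line up its peaks with a reference, then a Gaussian similarity is computed that could serve as an SVM kernel. The cross-correlation alignment, the RBF form, the parameter defaults, and all names are assumptions made for illustration; the abstract does not specify them, and SVM training itself is omitted.

```python
import math
from typing import List, Sequence


def best_shift(
    reference: Sequence[float], signal: Sequence[float], max_shift: int
) -> int:
    """Integer shift s that maximizes sum_i reference[i] * signal[i + s]."""

    def overlap(s: int) -> float:
        return sum(
            reference[i] * signal[i + s]
            for i in range(len(reference))
            if 0 <= i + s < len(signal)
        )

    return max(range(-max_shift, max_shift + 1), key=overlap)


def align(
    reference: Sequence[float], signal: Sequence[float], max_shift: int = 3
) -> List[float]:
    """Shift the signal so its peaks line up with the reference; pad with zeros."""
    s = best_shift(reference, signal, max_shift)
    n = len(signal)
    return [signal[i + s] if 0 <= i + s < n else 0.0 for i in range(len(reference))]


def rbf_similarity(a: Sequence[float], b: Sequence[float], gamma: float = 0.5) -> float:
    """Gaussian similarity between two equally sampled spectra."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)
```

Two spectra with the same peak at slightly different positions look almost unrelated to the raw similarity measure, which is why alignment comes first.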

    KERNELS AND KERNEL METHODS FOR SPECTRAL DATA
    25.
    Type: Invention application
    Legal status: Lapsed

    Publication No.: US20080097940A1

    Publication date: 2008-04-24

    Application No.: US11929169

    Filing date: 2007-10-30

    IPC classification: G06F15/18

    Abstract: Support vector machines are used to classify data contained within a structured dataset such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.
