Kernels and methods for selecting kernels for use in learning machines
    2.
    Granted invention patent
    Kernels and methods for selecting kernels for use in learning machines (status: lapsed)

    Publication No.: US07788193B2

    Publication date: 2010-08-31

    Application No.: US11929354

    Filing date: 2007-10-30

    IPC classification: G06F15/18 G06F17/00 G06N5/00

    Abstract: Learning machines, such as support vector machines, are used to analyze datasets and recognize patterns within them, using kernels that are selected according to the nature of the data to be analyzed. Where a dataset possesses structural characteristics, locational kernels can be used to provide measures of similarity among data points within the dataset. The locational kernels are then combined to generate a decision function, or kernel, that can be used to analyze the dataset. Where an invariance transformation or noise is present, tangent vectors are defined to identify relationships between the invariance or noise and the data points. A covariance matrix is formed from the tangent vectors and then used to generate the kernel.

    KERNELS AND METHODS FOR SELECTING KERNELS FOR USE IN LEARNING MACHINES
    3.
    Invention patent application
    KERNELS AND METHODS FOR SELECTING KERNELS FOR USE IN LEARNING MACHINES (status: lapsed)

    Publication No.: US20080301070A1

    Publication date: 2008-12-04

    Application No.: US11929354

    Filing date: 2007-10-30

    IPC classification: G06F15/18

    Abstract: Learning machines, such as support vector machines, are used to analyze datasets and recognize patterns within them, using kernels that are selected according to the nature of the data to be analyzed. Where a dataset possesses structural characteristics, locational kernels can be used to provide measures of similarity among data points within the dataset. The locational kernels are then combined to generate a decision function, or kernel, that can be used to analyze the dataset. Where an invariance transformation or noise is present, tangent vectors are defined to identify relationships between the invariance or noise and the data points. A covariance matrix is formed from the tangent vectors and then used to generate the kernel.

    Selection of features predictive of biological conditions using protein mass spectrographic data
    4.
    Granted invention patent
    Selection of features predictive of biological conditions using protein mass spectrographic data (status: lapsed)

    Publication No.: US07676442B2

    Publication date: 2010-03-09

    Application No.: US11929169

    Filing date: 2007-10-30

    IPC classification: G06N5/00

    Abstract: Support vector machines are used to classify data contained within a structured dataset, such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.

    Kernels and kernel methods for spectral data
    5.
    Granted invention patent
    Kernels and kernel methods for spectral data (status: in force)

    Publication No.: US07617163B2

    Publication date: 2009-11-10

    Application No.: US10267977

    Filing date: 2002-10-09

    IPC classification: G06F15/18

    Abstract: Support vector machines are used to classify data contained within a structured dataset, such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.
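
    Illustration (not part of the patent record): the abstract's first two steps, peak alignment followed by a pairwise similarity measure, might be sketched as below. The alignment rule (integer shift chosen by cross-correlation), the Gaussian similarity, and all names and parameters (`align_to_reference`, `rbf_similarity`, `max_shift`, `sigma`) are assumptions made for this sketch; the patent may use different methods.

```python
import numpy as np

def align_to_reference(spectrum, reference, max_shift=5):
    """Shift `spectrum` by the integer offset that best matches `reference`.

    The offset maximizes the dot product (cross-correlation). np.roll wraps
    around, which is only reasonable when the spectrum edges are near zero.
    """
    shifts = range(-max_shift, max_shift + 1)
    best = max(shifts, key=lambda s: float(np.dot(np.roll(spectrum, s), reference)))
    return np.roll(spectrum, best)

def rbf_similarity(a, b, sigma=1.0):
    """Gaussian (RBF) similarity between two spectra of equal length."""
    return float(np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2)))

# Toy data: one Gaussian peak, and a copy displaced by 3 bins.
bins = np.arange(64)
reference = np.exp(-0.5 * ((bins - 20) / 2.0) ** 2)
sample = np.roll(reference, 3)

aligned = align_to_reference(sample, reference)
before = rbf_similarity(sample, reference)   # misaligned peaks look dissimilar
after = rbf_similarity(aligned, reference)   # aligned peaks compare as equal
```

    A matrix of such pairwise similarities could then be passed to a support vector machine as a precomputed kernel, provided the chosen similarity is positive semidefinite.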

    Kernels for identifying patterns in datasets containing noise or transformation invariances
    6.
    Granted invention patent
    Kernels for identifying patterns in datasets containing noise or transformation invariances (status: in force)

    Publication No.: US08209269B2

    Publication date: 2012-06-26

    Application No.: US12868658

    Filing date: 2010-08-25

    IPC classification: G06F15/18 G06F17/00 G06N5/00

    Abstract: Learning machines, such as support vector machines, are used to analyze datasets and recognize patterns within them, using kernels that are selected according to the nature of the data to be analyzed. Where the datasets include an invariance transformation or noise, tangent vectors are defined to identify relationships between the invariance or noise and the training data points. A covariance matrix is formed from the tangent vectors and then used to generate the kernel, which may be based on a kernel PCA map.
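
    Illustration (not part of the patent record): a minimal sketch of how a covariance matrix built from tangent vectors could enter a kernel, shown for the linear case only. The finite-difference tangents, the regularized form (1 - gamma) I + gamma C, and the names `tangent_vectors`, `tangent_covariance`, `invariant_linear_kernel`, and `shift_first_axis` are assumptions for this sketch; the nonlinear variant based on a kernel PCA map is not shown.

```python
import numpy as np

def tangent_vectors(X, transform, eps=1e-3):
    """Finite-difference tangents (transform(x, eps) - x) / eps, one per row of X."""
    return np.array([(transform(x, eps) - x) / eps for x in X])

def tangent_covariance(tangents):
    """Average outer product of the tangent vectors (a d x d matrix)."""
    return tangents.T @ tangents / len(tangents)

def invariant_linear_kernel(A, B, C, gamma=0.5):
    """k(a, b) = a^T [(1 - gamma) I + gamma C]^{-1} b for the rows of A and B."""
    d = C.shape[0]
    C_gamma = (1.0 - gamma) * np.eye(d) + gamma * C
    return A @ np.linalg.solve(C_gamma, B.T)

# Hypothetical invariance: small translations along the first coordinate.
def shift_first_axis(x, t):
    return x + t * np.array([1.0, 0.0])

X = np.array([[1.0, 1.0], [2.0, -1.0], [0.0, 3.0]])
C = tangent_covariance(tangent_vectors(X, shift_first_axis))
K = invariant_linear_kernel(X, X, C, gamma=0.5)
```

    In this toy case the first coordinate, along which the data are assumed invariant, receives half the weight of the second coordinate in the resulting kernel.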

    Support vector machine-based method for analysis of spectral data
    7.
    Granted invention patent
    Support vector machine-based method for analysis of spectral data (status: lapsed)

    Publication No.: US08463718B2

    Publication date: 2013-06-11

    Application No.: US12700575

    Filing date: 2010-02-04

    IPC classification: G06F15/18

    Abstract: Support vector machines are used to classify data contained within a structured dataset, such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.

    Efficient algorithm for pairwise preference learning
    8.
    Granted invention patent
    Efficient algorithm for pairwise preference learning (status: in force)

    Publication No.: US08280829B2

    Publication date: 2012-10-02

    Application No.: US12504460

    Filing date: 2009-07-16

    IPC classification: G06F15/18

    CPC classification: G06N99/005

    Abstract: In one embodiment, training a ranking model comprises: accessing the ranking model and an objective function of the ranking model; accessing one or more preference pairs of objects, wherein for each preference pair, comprising a first object and a second object, there is a preference between the first object and the second object with respect to a particular reference, and the first object and the second object each have a feature vector comprising one or more feature values; and training the ranking model by minimizing the objective function using the preference pairs of objects, wherein for each preference pair, a difference between the first feature vector of the first object and the second feature vector of the second object is not calculated.
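
    Illustration (not part of the patent record): one way to minimize a pairwise ranking objective without ever forming a difference of feature vectors is to work with per-object scores and accumulate one gradient coefficient per object. The linear model, the squared hinge loss, plain gradient descent, and the names `loss_and_grad`, `train`, `lam`, and `lr` are assumptions for this sketch, not details taken from the patent.

```python
import numpy as np

def loss_and_grad(w, X, pairs, lam=1e-2):
    """Regularized squared-hinge ranking loss and its gradient.

    Each pair (i, j) means object i is preferred over object j. Only the
    scores X @ w are compared; x_i - x_j is never computed.
    """
    scores = X @ w                    # one score per object
    coef = np.zeros(len(X))           # per-object gradient coefficient
    loss = 0.5 * lam * float(w @ w)
    for i, j in pairs:
        slack = 1.0 - (scores[i] - scores[j])
        if slack > 0.0:               # pair violates the unit margin
            loss += slack ** 2
            coef[i] -= 2.0 * slack
            coef[j] += 2.0 * slack
    grad = lam * w + X.T @ coef       # one matrix-vector product for all pairs
    return loss, grad

def train(X, pairs, steps=300, lr=0.05):
    """Plain gradient descent from w = 0 (step size chosen for this toy data)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        _, g = loss_and_grad(w, X, pairs)
        w -= lr * g
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
pairs = [(0, 1), (0, 2), (2, 1)]      # object 0 > object 2 > object 1
w = train(X, pairs)
scores = X @ w
```

    The gradient costs one pass over the pairs plus a single product with the feature matrix, rather than one feature-vector subtraction per pair.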

    System and method for training a multi-class support vector machine to select a common subset of features for classifying objects
    9.
    Invention patent application
    System and method for training a multi-class support vector machine to select a common subset of features for classifying objects (status: in force)

    Publication No.: US20090150309A1

    Publication date: 2009-06-11

    Application No.: US12001932

    Filing date: 2007-12-10

    IPC classification: G06F15/18

    CPC classification: G06K9/6249 G06K9/6269

    Abstract: An improved system and method are provided for training a multi-class support vector machine to select a common subset of features for classifying objects. A multi-class support vector machine generator may be provided for learning classification functions to classify sets of objects into classes, and may include a sparse support vector machine modeling engine for training a multi-class support vector machine using scaling factors by simultaneously selecting a common subset of features iteratively for all classes from sets of features representing each of the classes. An objective function using scaling factors to ensure sparsity of features may be iteratively minimized, and features may be retained and added until a small set of features stabilizes. Alternatively, a common subset of features may be found by iteratively removing at least one feature simultaneously for all classes from an active set of features initialized to represent the entire set of training features.

    Method and system for distributed machine learning

    Publication No.: US09633315B2

    Publication date: 2017-04-25

    Application No.: US13458545

    Filing date: 2012-04-27

    IPC classification: G06N99/00 G06F15/18

    CPC classification: G06N99/005 G06F15/18

    Abstract: Method, system, and programs for distributed machine learning on a cluster including a plurality of nodes are disclosed. A machine learning process is performed in each of the plurality of nodes based on a respective subset of training data to calculate a local parameter. The training data is partitioned over the plurality of nodes. A plurality of operation nodes are determined from the plurality of nodes based on a status of the machine learning process performed in each of the plurality of nodes. The plurality of operation nodes are connected to form a network topology. An aggregated parameter is generated by merging local parameters calculated in each of the plurality of operation nodes in accordance with the network topology.
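
    Illustration (not part of the patent record): the flow described in the abstract can be simulated in a single process as below. The local learner (least squares), the status labels, the binary-tree topology, the sample-count weighting, and the names `local_parameter` and `tree_merge` are all assumptions for this sketch; a real cluster would exchange the parameters over the network.

```python
import numpy as np

def local_parameter(X_part, y_part):
    """Least-squares fit on one node's partition (stand-in for any local learner)."""
    w, *_ = np.linalg.lstsq(X_part, y_part, rcond=None)
    return w

def tree_merge(params, counts):
    """Merge parameters pairwise up a binary tree, weighting by sample count."""
    level = list(zip(params, counts))
    while len(level) > 1:
        nxt = []
        for k in range(0, len(level) - 1, 2):
            (p1, n1), (p2, n2) = level[k], level[k + 1]
            nxt.append(((n1 * p1 + n2 * p2) / (n1 + n2), n1 + n2))
        if len(level) % 2:            # an unpaired node passes through unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0][0]

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(120, 3))
y = X @ w_true                        # noise-free toy targets

# Partition the training data over four simulated nodes.
parts = list(zip(np.array_split(X, 4), np.array_split(y, 4)))
status = ["done", "done", "failed", "done"]   # hypothetical per-node status

# Only nodes whose learning process completed become operation nodes.
operation = [k for k, s in enumerate(status) if s == "done"]
params = [local_parameter(*parts[k]) for k in operation]
counts = [len(parts[k][1]) for k in operation]
aggregated = tree_merge(params, counts)
```

    Because the toy targets are noise-free, every local fit recovers the same weights, so the aggregated parameter matches `w_true` even though one node is excluded.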