Identification of co-regulation patterns by unsupervised cluster analysis of gene expression data
    1.
    Granted patent
    Identification of co-regulation patterns by unsupervised cluster analysis of gene expression data [Lapsed]

    Publication No.: US08489531B2

    Publication date: 2013-07-16

    Application No.: US13019585

    Filing date: 2011-02-02

    IPC classification: G06F17/00 G06N5/00

    Abstract: A method is provided for unsupervised clustering of gene expression data to identify co-regulation patterns. A clustering algorithm randomly divides the data into k different subsets and measures the similarity between pairs of datapoints within the subsets, assigning a score to the pairs based on similarity, with the greatest similarity giving the highest correlation score. A distribution of the scores is plotted for each k. The highest value of k whose distribution remains concentrated near the highest correlation score corresponds to the number of co-regulation patterns.
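
The selection rule in the last sentence of this abstract can be sketched as follows. This is only an illustration: the abstract does not say how "concentrated near the highest correlation score" is measured, so the thresholds `near` and `min_fraction`, the function name `pick_k`, and the toy score lists are all assumptions, not values from the patent.

```python
def pick_k(score_dists, near=0.9, min_fraction=0.9):
    """Return the largest k whose scores stay concentrated near 1.0, or None.

    score_dists maps each k to the list of similarity scores observed for it.
    A distribution counts as "concentrated" (an assumed criterion) when at
    least min_fraction of its scores are >= near.
    """
    stable = [
        k for k, scores in score_dists.items()
        if scores and sum(s >= near for s in scores) / len(scores) >= min_fraction
    ]
    return max(stable) if stable else None


# Made-up score lists, for illustration only (not experimental results).
toy_scores = {
    2: [1.0, 0.98, 1.0, 0.97],
    3: [0.99, 1.0, 0.95, 1.0],
    4: [0.60, 0.90, 0.45, 0.70],
    5: [0.30, 0.55, 0.40, 0.35],
}
chosen_k = pick_k(toy_scores)
```

Taking the maximum over all stable k is a simplification; a stricter reading would stop at the first k whose distribution spreads out.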

    SYSTEM AND METHOD FOR DISTRIBUTED ELICITATION AND AGGREGATION OF RISK INFORMATION
    2.
    Patent application
    SYSTEM AND METHOD FOR DISTRIBUTED ELICITATION AND AGGREGATION OF RISK INFORMATION [Under examination, published]

    Publication No.: US20110153383A1

    Publication date: 2011-06-23

    Application No.: US12640082

    Filing date: 2009-12-17

    IPC classification: G06Q10/00 G06N5/02

    Abstract: A method and system for the distributed elicitation and aggregation of risk information is provided. The method comprises selecting a risk network, the risk network comprising one or more risk nodes having associated risk information; assigning a role to each risk node, said role indicating a type of user to evaluate the risk node; generating a customized survey to elicit risk information for a risk node based upon the role and the user, wherein an order of questions in the customized survey presented to the user is determined by an ordering criterion; publishing the customized survey to the user; collecting risk information for the risk node from the user's answers to the customized survey; and populating the risk nodes based on the collected risk information.
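
The flow in this abstract (assign roles, build an ordered survey, collect answers, populate nodes) might look roughly like the sketch below. All names (`RiskNode`, `build_survey`, `populate`), the `priority` ordering key, and the sample nodes and answers are hypothetical; the abstract does not specify the ordering criterion or any data model.

```python
from dataclasses import dataclass, field


@dataclass
class RiskNode:
    name: str
    role: str            # type of user expected to evaluate this node
    priority: int = 0    # assumed ordering key: lower values are asked first
    info: dict = field(default_factory=dict)


def build_survey(nodes, user, role):
    """Return (node name, question) pairs for one user, ordered by priority."""
    mine = sorted((n for n in nodes if n.role == role), key=lambda n: n.priority)
    return [(n.name, f"{user}, how likely is '{n.name}'?") for n in mine]


def populate(nodes, user, answers):
    """Store each answer on the matching risk node, keyed by the responding user."""
    by_name = {n.name: n for n in nodes}
    for name, value in answers.items():
        by_name[name].info[user] = value


# Illustrative network and answers (made up for this example).
network = [
    RiskNode("supplier delay", role="procurement", priority=2),
    RiskNode("currency swing", role="finance", priority=1),
    RiskNode("part shortage", role="procurement", priority=1),
]
survey = build_survey(network, "alice", "procurement")
populate(network, "alice", {"part shortage": 0.3, "supplier delay": 0.1})
```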

    KERNELS AND METHODS FOR SELECTING KERNELS FOR USE IN LEARNING MACHINES
    3.
    Patent application
    KERNELS AND METHODS FOR SELECTING KERNELS FOR USE IN LEARNING MACHINES [Lapsed]

    Publication No.: US20080301070A1

    Publication date: 2008-12-04

    Application No.: US11929354

    Filing date: 2007-10-30

    IPC classification: G06F15/18

    Abstract: Learning machines, such as support vector machines, are used to analyze datasets to recognize patterns within the dataset using kernels that are selected according to the nature of the data to be analyzed. Where the dataset possesses structural characteristics, locational kernels can be utilized to provide measures of similarity among data points within the dataset. The locational kernels are then combined to generate a decision function, or kernel, that can be used to analyze the dataset. Where an invariance transformation or noise is present, tangent vectors are defined to identify relationships between the invariance or noise and the data points. A covariance matrix is formed using the tangent vectors, then used in generation of the kernel.
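
As a rough illustration of the tangent-vector step, the sketch below approximates each tangent vector by a finite difference along a one-parameter transformation and averages their outer products into a covariance matrix. The abstract gives no formulas, so the finite-difference approach, the function names, and the example `scale` transformation are assumptions; how the matrix then enters the kernel is not shown here.

```python
def tangent_vector(x, transform, eps=1e-3):
    """Finite-difference tangent of x along a one-parameter transformation."""
    moved = transform(x, eps)
    return [(m - xi) / eps for m, xi in zip(moved, x)]


def tangent_covariance(samples, transform, eps=1e-3):
    """Average outer product of the tangent vectors of all samples."""
    dim = len(samples[0])
    cov = [[0.0] * dim for _ in range(dim)]
    for x in samples:
        t = tangent_vector(x, transform, eps)
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += t[i] * t[j] / len(samples)
    return cov


def scale(x, a):
    """Hypothetical invariance: multiplying the whole signal by (1 + a)."""
    return [(1.0 + a) * xi for xi in x]


# For a pure scaling, the tangent vector at x is x itself.
cov = tangent_covariance([[1.0, 0.0], [0.0, 2.0]], scale)
```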

    METHODS FOR FEATURE SELECTION IN A LEARNING MACHINE
    4.
    Patent application
    METHODS FOR FEATURE SELECTION IN A LEARNING MACHINE [In force]

    Publication No.: US20080215513A1

    Publication date: 2008-09-04

    Application No.: US11929213

    Filing date: 2007-10-30

    IPC classification: G06F15/18

    CPC classification: G06K9/6231 G06N99/005

    Abstract: In a pre-processing step prior to training a learning machine, the quantity of features to be processed is reduced using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of a cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, and transductive feature selection. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.
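
Of the listed methods, recursive feature elimination is the simplest to sketch: train a linear model, drop the feature with the smallest squared weight, and repeat. In the example below a plain least-squares model trained by gradient descent stands in for the learning machine (the abstract does not fix the trainer), and the function names and the toy data, built so that only feature 0 carries the label, are assumptions.

```python
def train_linear(X, y, epochs=200, lr=0.1):
    """Fit weights w and bias b by gradient descent on squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [sum(wj * xj for wj, xj in zip(w, x)) + b - t for x, t in zip(X, y)]
        w = [wj - lr * sum(e * x[j] for e, x in zip(errs, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b


def rfe(X, y, n_keep):
    """Recursive feature elimination: retrain, then drop the smallest-w^2 feature."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        w, _ = train_linear([[x[j] for j in active] for x in X], y)
        worst = min(range(len(active)), key=lambda i: w[i] ** 2)
        del active[worst]
    return active


# Toy data: feature 0 equals twice the label; features 1 and 2 are
# balanced +/-0.5 patterns that carry no information about the label.
y = [1, 1, 1, 1, -1, -1, -1, -1]
X = [[2 * t, a, b] for t in (1, -1) for a in (0.5, -0.5) for b in (0.5, -0.5)]
kept = rfe(X, y, n_keep=1)
```

Retraining after every removal is what distinguishes RFE from a one-shot ranking: the weights of the surviving features can change once a correlated feature is gone.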

    Methods for feature selection in a learning machine
    5.
    Granted patent
    Methods for feature selection in a learning machine [In force]

    Publication No.: US07624074B2

    Publication date: 2009-11-24

    Application No.: US11929213

    Filing date: 2007-10-30

    IPC classification: G06F15/18

    CPC classification: G06K9/6231 G06N99/005

    Abstract: In a pre-processing step prior to training a learning machine, the quantity of features to be processed is reduced using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of a cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, and transductive feature selection. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.

    MODEL SELECTION FOR CLUSTER DATA ANALYSIS
    6.
    Patent application
    MODEL SELECTION FOR CLUSTER DATA ANALYSIS [Lapsed]

    Publication No.: US20080140592A1

    Publication date: 2008-06-12

    Application No.: US11929522

    Filing date: 2007-10-30

    IPC classification: G06F15/18

    Abstract: A model selection method is provided for choosing the number of clusters, or more generally the parameters of a clustering algorithm. The algorithm is based on comparing the similarity between pairs of clustering runs on sub-samples or other perturbations of the data. High pairwise similarities show that the clustering represents a stable pattern in the data. The method is applicable to any clustering algorithm, and can also detect lack of structure. We show results on artificial and real data using a hierarchical clustering algorithm.
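
A minimal sketch of the sub-sampling comparison follows, under several assumptions the abstract leaves open: a naive single-linkage hierarchical clustering, the Jaccard index over co-clustered point pairs as the similarity measure, 80% sub-samples, and synthetic 2-D points in three well-separated blobs. All function names are illustrative.

```python
import math
import random
from itertools import combinations


def single_linkage(points, k):
    """Naive agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Single linkage: cluster distance is the smallest point-to-point distance.
        _, a, b = min(
            (min(math.dist(points[i], points[j])
                 for i in clusters[a] for j in clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def jaccard(labels_a, labels_b):
    """Jaccard index over the point pairs that each labeling puts in one cluster."""
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        both += same_a and same_b
        either += same_a or same_b
    return both / either if either else 1.0


def stability_scores(points, ks, n_pairs=10, frac=0.8, seed=0):
    """For each k, compare clusterings of random sub-sample pairs on their shared points."""
    rng = random.Random(seed)
    n, m = len(points), int(frac * len(points))
    scores = {k: [] for k in ks}
    for k in ks:
        for _ in range(n_pairs):
            s1, s2 = rng.sample(range(n), m), rng.sample(range(n), m)
            l1 = dict(zip(s1, single_linkage([points[i] for i in s1], k)))
            l2 = dict(zip(s2, single_linkage([points[i] for i in s2], k)))
            shared = sorted(set(s1) & set(s2))
            scores[k].append(jaccard([l1[i] for i in shared], [l2[i] for i in shared]))
    return scores


# Synthetic data: three blobs of 10 points each (illustrative, not real data).
rng = random.Random(0)
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
          for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(10)]
scores = stability_scores(points, ks=[2, 3, 4])
for k, s in scores.items():
    print(k, round(sum(s) / len(s), 3))
```

Because the blobs are far apart relative to their size, every sub-sample clustering at k = 3 recovers the same three groups, so all of its scores equal 1; other values of k have no such guarantee.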

    Model selection for cluster data analysis
    7.
    Patent application
    Model selection for cluster data analysis [Under examination, published]

    Publication No.: US20050071140A1

    Publication date: 2005-03-31

    Application No.: US10478191

    Filing date: 2002-05-17

    Abstract: A model selection method is provided for choosing the number of clusters, or more generally the parameters of a clustering algorithm. The algorithm is based on comparing the similarity between pairs of clustering runs on sub-samples or other perturbations of the data. High pairwise similarities show that the clustering represents a stable pattern in the data. The method is applicable to any clustering algorithm, and can also detect lack of structure. We show results on artificial and real data using a hierarchical clustering algorithm.

    Support vector machine-based method for analysis of spectral data
    8.
    Granted patent
    Support vector machine-based method for analysis of spectral data [Lapsed]

    Publication No.: US08463718B2

    Publication date: 2013-06-11

    Application No.: US12700575

    Filing date: 2010-02-04

    IPC classification: G06F15/18

    Abstract: Support vector machines are used to classify data contained within a structured dataset such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.
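
The peak-alignment pre-processing could be approximated as below. The abstract does not say how alignment is performed; this sketch assumes a simple integer shift chosen to maximize the cross-correlation with a reference spectrum, and the function names and the two short example spectra are made up for illustration.

```python
def cross_correlation(ref, sig, shift):
    """Sum of ref[i] * sig[i + shift] over the indices where both exist."""
    return sum(r * sig[i + shift] for i, r in enumerate(ref)
               if 0 <= i + shift < len(sig))


def best_shift(ref, sig, max_shift):
    """Shift of sig relative to ref (in samples) that maximizes the cross-correlation."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: cross_correlation(ref, sig, s))


def align(ref, sig, max_shift=3):
    """Return sig moved so its peaks line up with ref, zero-padded at the edges."""
    s = best_shift(ref, sig, max_shift)
    return [sig[i + s] if 0 <= i + s < len(sig) else 0 for i in range(len(sig))]


reference = [0, 0, 1, 5, 1, 0, 0, 0]
spectrum = [0, 0, 0, 1, 5, 1, 0, 0]   # same peak, one sample later
aligned = align(reference, spectrum)
```

Once spectra share a common axis, a per-channel similarity measure between two samples compares like with like, which is what the later SVM training relies on.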

    Methods for feature selection in a learning machine
    9.
    Patent application
    Methods for feature selection in a learning machine [In force]

    Publication No.: US20050216426A1

    Publication date: 2005-09-29

    Application No.: US10478192

    Filing date: 2002-05-20

    IPC classification: G06F15/18 G09B5/00 G09B7/00

    Abstract: In a pre-processing step prior to training a learning machine, the quantity of features to be processed is reduced using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of a cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, and transductive feature selection. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.
