SYSTEM AND METHOD FOR DISTRIBUTED ELICITATION AND AGGREGATION OF RISK INFORMATION
    1.
    Invention application
    Status: Under examination (published)

    Publication (announcement) No.: US20110153383A1

    Publication (announcement) date: 2011-06-23

    Application No.: US12640082

    Filing date: 2009-12-17

    IPC classification: G06Q10/00 G06N5/02

    Abstract: A method and system for the distributed elicitation and aggregation of risk information are provided. The method comprises selecting a risk network, the risk network comprising one or more risk nodes having associated risk information; assigning a role to each risk node, said role indicating a type of user to evaluate the risk node; generating a customized survey to elicit risk information for a risk node based upon the role and the user, wherein an order of questions in the customized survey presented to the user is determined by an ordering criterion; publishing the customized survey to the user; collecting risk information for the risk node from the user's answers to the customized survey; and populating the risk nodes based on the collected risk information.

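The survey-generation and population steps named in this abstract can be sketched roughly as follows. This is only an illustration: the `RiskNode` class, its `priority` field, the question wording, and the use of "descending priority" as the ordering criterion are assumptions, since the abstract does not specify any of them.

```python
from dataclasses import dataclass, field


@dataclass
class RiskNode:
    """Hypothetical risk node; the field names are illustrative assumptions."""
    name: str
    role: str                  # type of user assigned to evaluate this node
    priority: float            # assumed ordering key, not taken from the abstract
    answers: dict = field(default_factory=dict)


def generate_survey(network, user_role, ordering_key=lambda n: -n.priority):
    """Return questions for the nodes assigned to user_role, sorted by ordering_key."""
    nodes = sorted((n for n in network if n.role == user_role), key=ordering_key)
    return [(n.name, f"How likely is '{n.name}'? (0-1)") for n in nodes]


def populate(network, user, responses):
    """Store each of the user's answers on the matching risk node."""
    by_name = {n.name: n for n in network}
    for node_name, value in responses.items():
        by_name[node_name].answers[user] = value


network = [
    RiskNode("supplier delay", role="procurement", priority=0.4),
    RiskNode("server outage", role="it", priority=0.9),
    RiskNode("price spike", role="procurement", priority=0.7),
]
survey = generate_survey(network, "procurement")
populate(network, "alice", {"price spike": 0.3, "supplier delay": 0.1})
```

Passing a different `ordering_key` changes the question order without touching the role filter, which is one plausible way to keep the ordering criterion configurable.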

    MODEL SELECTION FOR CLUSTER DATA ANALYSIS
    2.
    Invention application
    Status: Lapsed

    Publication (announcement) No.: US20080140592A1

    Publication (announcement) date: 2008-06-12

    Application No.: US11929522

    Filing date: 2007-10-30

    IPC classification: G06F15/18

    Abstract: A model selection method is provided for choosing the number of clusters, or more generally the parameters of a clustering algorithm. The algorithm is based on comparing the similarity between pairs of clustering runs on sub-samples or other perturbations of the data. High pairwise similarities show that the clustering represents a stable pattern in the data. The method is applicable to any clustering algorithm, and can also detect lack of structure. We show results on artificial and real data using a hierarchical clustering algorithm.

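The stability idea in this abstract can be sketched as below. It is a minimal illustration under stated assumptions, not the patented procedure: the naive single-linkage clusterer, the Jaccard index over co-clustered pairs as the similarity measure, the 80% sub-sample fraction, and the toy 1-D data are all choices made here for brevity.

```python
import random
from itertools import combinations


def single_linkage(points, k):
    """Naive agglomerative (single-linkage) clustering of 1-D points into k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(abs(points[i] - points[j]) for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)   # a < b, so index a stays valid after the pop
    return {i: c for c, members in enumerate(clusters) for i in members}


def jaccard_similarity(labels1, labels2):
    """Jaccard index over pairs of shared points that are co-clustered in each run."""
    shared = sorted(set(labels1) & set(labels2))
    both = either = 0
    for i, j in combinations(shared, 2):
        same1 = labels1[i] == labels1[j]
        same2 = labels2[i] == labels2[j]
        both += same1 and same2
        either += same1 or same2
    return both / either if either else 1.0


def stability(data, k, runs=10, fraction=0.8, seed=0):
    """Mean similarity between pairs of clustering runs on random sub-samples."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        labellings = []
        for _ in range(2):
            idx = rng.sample(range(len(data)), int(fraction * len(data)))
            local = single_linkage([data[i] for i in idx], k)
            # Map sub-sample positions back to indices in the full data set.
            labellings.append({idx[p]: c for p, c in local.items()})
        scores.append(jaccard_similarity(*labellings))
    return sum(scores) / len(scores)


# Toy data: three well-separated groups of eight points each.
data = [10 * g + 0.1 * i for g in range(3) for i in range(8)]
scores = {k: stability(data, k) for k in (2, 3, 4)}
```

One would then prefer the `k` with the highest mean similarity; uniformly low scores across all `k` would hint at a lack of structure. For this toy data every pair of runs at `k = 3` recovers the same three groups, so `scores[3]` is 1.0.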

    Model selection for cluster data analysis
    3.
    Invention application
    Status: Under examination (published)

    Publication (announcement) No.: US20050071140A1

    Publication (announcement) date: 2005-03-31

    Application No.: US10478191

    Filing date: 2002-05-17

    Abstract: A model selection method is provided for choosing the number of clusters, or more generally the parameters of a clustering algorithm. The algorithm is based on comparing the similarity between pairs of clustering runs on sub-samples or other perturbations of the data. High pairwise similarities show that the clustering represents a stable pattern in the data. The method is applicable to any clustering algorithm, and can also detect lack of structure. We show results on artificial and real data using a hierarchical clustering algorithm.


    Support vector machine-based method for analysis of spectral data
    4.
    Granted invention patent
    Status: Lapsed

    Publication (announcement) No.: US08463718B2

    Publication (announcement) date: 2013-06-11

    Application No.: US12700575

    Filing date: 2010-02-04

    IPC classification: G06F15/18

    Abstract: Support vector machines are used to classify data contained within a structured dataset such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.


    Methods for feature selection in a learning machine
    5.
    Invention application
    Status: In force

    Publication (announcement) No.: US20050216426A1

    Publication (announcement) date: 2005-09-29

    Application No.: US10478192

    Filing date: 2002-05-20

    IPC classification: G06F15/18 G09B5/00 G09B7/00

    Abstract: In a pre-processing step prior to training a learning machine, the quantity of features to be processed is reduced using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of a cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, and transductive feature selection. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.

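Of the methods listed in this abstract, recursive feature elimination (RFE) is the easiest to sketch. The version below is an assumption-laden illustration: it swaps in a plain least-squares linear model trained by gradient descent where an SVM would normally supply the weights, and the function names, learning rate, step count, and toy data are all hypothetical.

```python
def train_linear(X, y, features, lr=0.1, steps=200):
    """Least-squares linear model on the chosen feature columns (gradient descent).

    Stand-in for the learning machine; an SVM's weight vector could be used instead.
    """
    w = {f: 0.0 for f in features}
    n = len(X)
    for _ in range(steps):
        residuals = [sum(w[f] * row[f] for f in features) - t for row, t in zip(X, y)]
        for f in features:
            grad = sum(r * row[f] for r, row in zip(residuals, X)) / n
            w[f] -= lr * grad
    return w


def recursive_feature_elimination(X, y, n_keep=1):
    """Repeatedly retrain, then drop the feature with the smallest absolute weight."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        w = train_linear(X, y, remaining)
        weakest = min(remaining, key=lambda f: abs(w[f]))
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated


# Toy data: feature 0 equals the label, feature 1 is partly correlated with it,
# feature 2 is uncorrelated with it.
X = [
    [1, 1, 1], [1, 1, -1], [1, 1, 1], [1, -1, -1],
    [-1, -1, 1], [-1, -1, -1], [-1, -1, 1], [-1, 1, -1],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]
remaining, eliminated = recursive_feature_elimination(X, y)
```

Retraining after each removal is what distinguishes RFE from a one-shot ranking: the weights of the surviving features are re-estimated once a competitor is gone. Here the uncorrelated feature is dropped first and the label-defining feature survives.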

    Kernels and methods for selecting kernels for use in learning machines
    7.
    Invention application
    Status: In force

    Publication (announcement) No.: US20050071300A1

    Publication (announcement) date: 2005-03-31

    Application No.: US10477078

    Filing date: 2002-05-07

    Abstract: Kernels (206) for use in learning machines, such as support vector machines, are provided, together with methods for selecting and constructing such kernels according to the nature of the data to be analyzed (203). In particular, the data may possess characteristics such as structure (for example, DNA sequences; documents; graphs; signals, such as ECG signals and microarray expression profiles; spectra; images; spatio-temporal data; and relational data) and may possess invariances or noise components that can interfere with the ability to accurately extract the desired information. Where structured datasets are analyzed, locational kernels are defined to provide measures of similarity among data points (210). The locational kernels are then combined to generate the decision function, or kernel. Where invariance transformations or noise are present, tangent vectors are defined to identify relationships between the invariance or noise and the data points (222). A covariance matrix is formed using the tangent vectors and is then used in generation of the kernel.


    Method for feature selection and for evaluating features identified as significant for classifying data
    8.
    Granted invention patent
    Status: In force

    Publication (announcement) No.: US07970718B2

    Publication (announcement) date: 2011-06-28

    Application No.: US12890705

    Filing date: 2010-09-26

    IPC classification: G06F15/18

    Abstract: A group of features that has been identified as “significant” in being able to separate data into classes is evaluated using a support vector machine which separates the dataset into classes one feature at a time. After separation, an extremal margin value is assigned to each feature based on the distance between the lowest feature value in the first class and the highest feature value in the second class. Separately, extremal margin values are calculated for a normal distribution within a large number of randomly drawn example sets for the two classes to determine the number of examples within the normal distribution that would have a specified extremal margin value. Using p-values calculated for the normal distribution, a desired p-value is selected. The specified extremal margin value corresponding to the selected p-value is compared to the calculated extremal margin values for the group of features. The features in the group that have a calculated extremal margin value less than the specified margin value are labeled as falsely significant.

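The extremal-margin test in this abstract might be sketched as follows. Several details are assumptions rather than statements of the patented method: the margin is taken as a signed gap, feature values are assumed to be standardized so they can be compared against a standard normal null, the threshold is read off an empirical quantile, and all names and data are hypothetical.

```python
import random


def extremal_margin(class_a, class_b):
    """Signed gap between the lowest value in class_a and the highest in class_b."""
    return min(class_a) - max(class_b)


def null_margins(n_a, n_b, draws=2000, seed=0):
    """Extremal margins of example sets drawn from one standard normal (no class effect)."""
    rng = random.Random(seed)
    margins = []
    for _ in range(draws):
        a = [rng.gauss(0, 1) for _ in range(n_a)]
        b = [rng.gauss(0, 1) for _ in range(n_b)]
        margins.append(extremal_margin(a, b))
    return sorted(margins)


def margin_threshold(sorted_null, p_value):
    """Margin that roughly a fraction p_value of the null draws reach or exceed."""
    cut = int((1 - p_value) * len(sorted_null))
    return sorted_null[min(cut, len(sorted_null) - 1)]


def flag_falsely_significant(features, threshold):
    """Names of features whose extremal margin falls below the threshold."""
    return [name for name, (a, b) in features.items() if extremal_margin(a, b) < threshold]


# Hypothetical standardized feature values for two classes of three examples each.
features = {
    "f_good": ([3.0, 3.5, 4.0], [-4.0, -3.5, -3.0]),
    "f_bad": ([0.1, 0.5, -0.6], [0.0, 0.9, -0.4]),
}
threshold = margin_threshold(null_margins(3, 3), p_value=0.05)
flagged = flag_falsely_significant(features, threshold)
```

With three examples per class, a positive margin under the null requires all of one class to exceed all of the other, which happens in 1 of 20 orderings, so the 5% threshold lands near zero. The overlapping feature is therefore flagged and the cleanly separated one is not.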

    SUPPORT VECTOR MACHINE-BASED METHOD FOR ANALYSIS OF SPECTRAL DATA
    9.
    Invention application
    Status: Lapsed

    Publication (announcement) No.: US20100205124A1

    Publication (announcement) date: 2010-08-12

    Application No.: US12700575

    Filing date: 2010-02-04

    IPC classification: G06F15/18

    Abstract: Support vector machines are used to classify data contained within a structured dataset such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.


    SYSTEM AND METHOD FOR ANALYZING ELECTRONIC DATA RECORDS
    10.
    Invention application
    Status: In force

    Publication (announcement) No.: US20090083231A1

    Publication (announcement) date: 2009-03-26

    Application No.: US12212950

    Filing date: 2008-09-18

    IPC classification: G06F17/30

    Abstract: A system and method for analyzing electronic data records, including: an annotation unit operable to receive a set of electronic data records and to compute concept vectors for the set of electronic data records, wherein the coordinates of the concept vectors represent scores of the concepts in the respective electronic data record and wherein the concepts are part of an ontology; a similarity network unit operable to compute a similarity network by means of the concept vectors and at least one relationship between the concepts of the ontology, the similarity network representing similarities between the electronic data records, wherein the vertices of the similarity network represent the electronic data records and the edges of the similarity network represent similarity values indicating a degree of similarity between the vertices; and steps for executing the system.

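The concept-vector and similarity-network steps in this abstract could look roughly like the sketch below. The abstract names neither a similarity measure nor a specific ontology relationship, so the cosine similarity, the "is-a" parent lookup in `ONTOLOGY_PARENT`, the 0.5 propagation weight, and the edge threshold are all assumptions made for illustration.

```python
import math
from itertools import combinations

# Hypothetical ontology relationship: child concept -> parent concept ("is-a").
ONTOLOGY_PARENT = {
    "svm": "classifier",
    "decision tree": "classifier",
    "k-means": "clustering",
}


def expand(vector, weight=0.5):
    """Add a share of each concept's score to its parent concept in the ontology."""
    out = dict(vector)
    for concept, score in vector.items():
        parent = ONTOLOGY_PARENT.get(concept)
        if parent:
            out[parent] = out.get(parent, 0.0) + weight * score
    return out


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors stored as dicts."""
    dot = sum(u[c] * v[c] for c in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def similarity_network(records, threshold=0.1):
    """Vertices are record ids; edges carry similarity values at or above threshold."""
    expanded = {rid: expand(vec) for rid, vec in records.items()}
    edges = {}
    for a, b in combinations(sorted(expanded), 2):
        s = cosine(expanded[a], expanded[b])
        if s >= threshold:
            edges[(a, b)] = s
    return edges


# Concept vectors: coordinates are concept scores for each record.
records = {
    "doc1": {"svm": 1.0},
    "doc2": {"decision tree": 1.0},
    "doc3": {"k-means": 1.0},
}
edges = similarity_network(records)
```

Without the ontology step, `doc1` and `doc2` share no concept and would be unconnected; propagating scores to the common parent `classifier` gives them a similarity of about 0.2, while `doc3` stays isolated.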