-
Publication No.: US20050071300A1
Publication date: 2005-03-31
Application No.: US10477078
Filing date: 2002-05-07
CPC classification: G06K9/623, G06F19/24, G06K9/6215, G06K9/6248, G06K9/6269, G06N99/005, G06Q20/042
Abstract: Kernels (206) for use in learning machines, such as support vector machines, are provided, together with methods for selecting and constructing such kernels, where the selection and construction are controlled by the nature of the data to be analyzed (203). In particular, the data may possess characteristics such as structure (for example, DNA sequences; documents; graphs; signals, such as ECG signals and microarray expression profiles; spectra; images; spatio-temporal data; and relational data), and may possess invariances or noise components that can interfere with the ability to accurately extract the desired information. Where structured datasets are analyzed, locational kernels are defined to provide measures of similarity among data points (210). The locational kernels are then combined to generate the decision function, or kernel. Where invariance transformations or noise are present, tangent vectors are defined to identify relationships between the invariance or noise and the data points (222). A covariance matrix is formed using the tangent vectors and is then used in generating the kernel.
-
Publication No.: US20080215513A1
Publication date: 2008-09-04
Application No.: US11929213
Filing date: 2007-10-30
IPC classification: G06F15/18
CPC classification: G06K9/6231, G06N99/005
Abstract: In a pre-processing step prior to training a learning machine, the quantity of features to be processed is reduced using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of a cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, and transductive feature selection. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.
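As an illustration of the recursive feature elimination (RFE) step named in the abstract above, the following is a minimal sketch and not the patented method. It assumes a plain least-squares linear model trained by gradient descent in place of a support vector machine, ranks features by squared weight, and removes one feature per round. The function names (`train_linear`, `rfe`, `make_toy_data`), the toy data, and all parameter values are hypothetical and chosen only for this example.

```python
import random


def train_linear(X, y, epochs=200, lr=0.05):
    """Fit weights (no bias term) by gradient descent on mean squared error.

    Assumption: this simple model stands in for the learning machine;
    the abstract does not prescribe a particular one.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def rfe(X, y, n_keep):
    """Return the original column indices of the n_keep surviving features."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        # Retrain on the features that are still active.
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear(X_sub, y)
        # Ranking criterion (assumed): smallest squared weight is eliminated.
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active


def make_toy_data(n=60, seed=0):
    """Hypothetical data: columns 0 and 2 carry the label, 1 and 3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1.0, 1.0])
        X.append([
            label + rng.gauss(0, 0.3),   # informative
            rng.gauss(0, 1.0),           # noise
            -label + rng.gauss(0, 0.3),  # informative
            rng.gauss(0, 1.0),           # noise
        ])
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = rfe(X, y, n_keep=2)
print("surviving feature indices:", selected)
```

The surviving columns would then be the only inputs used to train the final learning machine. Removing one feature per round keeps the sketch short; removing several per round is a common speed-up when the feature count is large.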
-
Publication No.: US07624074B2
Publication date: 2009-11-24
Application No.: US11929213
Filing date: 2007-10-30
IPC classification: G06F15/18
CPC classification: G06K9/6231, G06N99/005
Abstract: In a pre-processing step prior to training a learning machine, the quantity of features to be processed is reduced using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of a cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, and transductive feature selection. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.
-
-