-
Publication No.: US07937351B2
Publication Date: 2011-05-03
Application No.: US12356061
Filing Date: 2009-01-19
CPC Classification: G06F17/30595, G06F2216/03, G06K9/6253, G06K9/6269
Abstract: An implementation of SVM functionality improves efficiency and data security, reduces time consumption and the parameter tuning challenges presented to inexperienced users, and reduces the computational cost of building SVM models. A computer program product for support vector machine processing in a computer system comprises computer program instructions for storing data, providing an interface to client software, building a support vector machine model on at least a portion of the stored data based on a plurality of model-building parameters, estimating values for at least some of the model-building parameters, and applying the support vector machine model using the stored data to generate a data mining output.
-
Publication No.: US07747624B2
Publication Date: 2010-06-29
Application No.: US10424850
Filing Date: 2003-04-29
IPC Classification: G06F17/30
CPC Classification: G06F17/30598, G06F17/30601, G06K9/6223, G06K9/6226, Y10S707/968
Abstract: A database management system provides the capability to perform cluster analysis, with improved performance in model building and data mining, good integration with the various databases throughout the enterprise, and flexible specification and adjustment of the models being built, while keeping data mining functionality accessible to users with limited data mining expertise and reducing development time and cost for data mining projects. A database management system for in-database clustering comprises a first data table and a second data table, each data table including a plurality of rows of data; means for building a clustering model using a portion of the first data table, wherein the portion is selected by partitioning, density summarization, or active sampling of the first data table; and means for applying the clustering model using the second data table to generate apply output data.
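The build-on-a-portion, apply-to-a-second-table flow described in the abstract above can be sketched roughly as follows. This is an illustration only, not the patented method: plain random sampling stands in for the partitioning, density summarization, or active sampling that the abstract names, k-means stands in for the unspecified clustering algorithm, and all function names (`build_clustering_model`, `apply_clustering_model`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, row):
    """Index of the centroid closest to the row."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], row))


def build_clustering_model(build_table, k, sample_size, iterations=20, seed=0):
    """Build k centroids from a portion of the first table.

    Assumption: the portion is a simple random sample, used here as a
    stand-in for the selection methods listed in the abstract.
    """
    rng = random.Random(seed)
    sample = rng.sample(build_table, min(sample_size, len(build_table)))
    centroids = rng.sample(sample, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for row in sample:
            groups[nearest(centroids, row)].append(row)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def apply_clustering_model(centroids, apply_table):
    """Assign each row of the second table to its nearest centroid."""
    return [nearest(centroids, row) for row in apply_table]
```

A caller would pass the rows of the first table to `build_clustering_model` and the rows of the second table to `apply_clustering_model`; the returned list of cluster indices plays the role of the apply output data.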
-
Publication No.: US08325748B2
Publication Date: 2012-12-04
Application No.: US11519781
Filing Date: 2006-09-13
Applicant: Marcos M. Campos
Inventor: Marcos M. Campos
CPC Classification: G06K9/6251, G06K9/6224
Abstract: A new process called a vector approximation graph (VA-graph) leverages a tree-based vector quantizer to quickly learn the topological structure of the data. It then uses the learned topology to enhance the performance of the vector quantizer. A method for analyzing data comprises receiving data, partitioning the data and generating a tree based on the partitions, learning a topology of a distribution of the data, and finding a best matching unit in the data using the learned topology.
-
Publication No.: US07849032B1
Publication Date: 2010-12-07
Application No.: US10153607
Filing Date: 2002-05-24
IPC Classification: G06N3/08
CPC Classification: G06N3/08
Abstract: A method, system, and computer program product provide automated determination of the size of the sample to be used in training a neural network data mining model, so that the sample is large enough to properly train the model yet no larger than necessary. A method of performing training of a neural network data mining model comprises the steps of: a) providing a training dataset for training an untrained neural network data mining model, the training dataset comprising a plurality of rows of data, b) selecting a row of data from the training dataset for performing training processing on the neural network data mining model, c) computing an estimate of a gradient or cost function of the neural network data mining model, d) determining whether the gradient or cost function of the neural network data mining model has converged, based on the computed estimate, e) repeating steps b)-d) if the gradient or cost function has not converged, and f) updating weights of the neural network data mining model if the gradient or cost function has converged.
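Steps a) through f) of the abstract above can be sketched as a single weight update that stops reading rows once its gradient estimate stabilizes. The abstract does not define the convergence test, so the one used here (the running mean of the per-row gradient changes by less than `tol`) is an assumption, as are the single linear unit standing in for a neural network, the squared-error cost, and every name in the code.

```python
import random


def row_gradient(weights, row):
    """Squared-error gradient for one row; the last field of the row is the target."""
    *x, y = row
    err = sum(w * xi for w, xi in zip(weights, x)) - y
    return [err * xi for xi in x]


def train_step(weights, dataset, lr=0.1, tol=1e-3, min_rows=10, seed=0):
    """One weight update that uses only as many rows as the estimate needs.

    Returns the updated weights and the number of rows consumed.
    """
    rng = random.Random(seed)
    order = rng.sample(dataset, len(dataset))  # visit rows in random order
    mean = [0.0] * len(weights)
    used = 0
    for row in order:                                   # step b: select a row
        g = row_gradient(weights, row)
        used += 1
        # step c: update the running mean of the gradient (the estimate)
        new_mean = [m + (gi - m) / used for m, gi in zip(mean, g)]
        change = max(abs(a - b) for a, b in zip(new_mean, mean))
        mean = new_mean
        # step d: assumed convergence test on the estimate
        if used >= min_rows and change < tol:
            break                                       # otherwise step e: repeat
    # step f: update the weights with the converged estimate
    new_weights = [w - lr * m for w, m in zip(weights, mean)]
    return new_weights, used
```

The second return value is the sample size the step actually needed, which is the quantity the abstract says is determined automatically.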
-
Publication No.: US07092941B1
Publication Date: 2006-08-15
Application No.: US10152574
Filing Date: 2002-05-23
Applicant: Marcos M. Campos
Inventor: Marcos M. Campos
IPC Classification: G06F17/30
CPC Classification: G06F17/30598, G06F17/30539, G06K9/6218, G06K9/6253, Y10S707/99936
Abstract: A system, software module, and computer program product for performing clustering-based data mining that provides improved performance in model building, good integration with the various databases throughout the enterprise, flexible specification and adjustment of the models being built, and flexible model arrangement and export capability. The software module for performing clustering-based data mining in an electronic data processing system comprises: a model setup block operable to receive client input including information specifying a setup of a clustering data mining model, generate the model setup, and generate parameters for the model setup based on the received information; a modeling algorithms block operable to select and initialize a clustering modeling algorithm based on the generated model setup; a model building block operable to receive training data and build a clustering model using the training data and the selected clustering modeling algorithm; and a model scoring block operable to receive scoring data and generate predictions and/or recommendations using the scoring data and the clustering model.
-
Publication No.: US07565370B2
Publication Date: 2009-07-21
Application No.: US10927024
Filing Date: 2004-08-27
IPC Classification: G06F17/00
CPC Classification: G06F17/30598, Y10S707/99933, Y10S707/99943
Abstract: An implementation of SVM functionality integrated into a relational database management system (RDBMS) improves efficiency and data security, reduces time consumption and the parameter tuning challenges presented to inexperienced users, and reduces the computational cost of building SVM models. A database management system comprises data stored in the database management system and a processing unit comprising a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the database management system, and an apply unit operable to apply the support vector machine model using the data stored in the database management system. The database management system may be a relational database management system.
-
Publication No.: US07069256B1
Publication Date: 2006-06-27
Application No.: US10152731
Filing Date: 2002-05-23
Applicant: Marcos M. Campos
Inventor: Marcos M. Campos
IPC Classification: G06F15/18
CPC Classification: G06N3/02, G06F17/30395, G06F17/30522
Abstract: A system, software module, and computer program product for performing neural-network-based data mining that provides improved performance in model building, good integration with the various databases throughout the enterprise, flexible specification and adjustment of the models being built, and flexible model arrangement and export capability. The software module for performing neural-network-based data mining in an electronic data processing system comprises: a model setup block operable to receive client input including information specifying a setup of a neural network data mining model, generate the model setup, and generate parameters for the model setup based on the received information; a modeling algorithms block operable to select and initialize a neural network modeling algorithm based on the generated model setup; a model building block operable to receive training data and build a neural network model using the training data and the selected neural network modeling algorithm; and a model scoring block operable to receive scoring data and generate predictions and/or recommendations using the scoring data and the neural network model.
-
Publication No.: US08781978B2
Publication Date: 2014-07-15
Application No.: US12356063
Filing Date: 2009-01-19
CPC Classification: G06F17/30595, G06F2216/03, G06K9/6253, G06K9/6269
Abstract: An implementation of SVM functionality improves efficiency and data security, reduces time consumption and the parameter tuning challenges presented to inexperienced users, and reduces the computational cost of building SVM models. A system for support vector machine processing comprises data stored in the system, a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the system, the portion of the data being selected using a stratified sampling method with respect to a target distribution, and an apply unit operable to apply the support vector machine model using the data stored in the system.
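One plausible reading of "stratified sampling with respect to a target distribution" in the abstract above is a sample whose class proportions match those of the target column in the full data. A minimal sketch under that assumption follows; the function name, the `target_of` accessor, and the rule of keeping at least one row per class are illustrative choices, not details from the patent.

```python
import random
from collections import defaultdict


def stratified_sample(rows, target_of, sample_size, seed=0):
    """Draw roughly sample_size rows while preserving target-class proportions.

    target_of is a caller-supplied function that returns the target value of
    a row. Because of rounding and the one-row minimum per class, the result
    can differ slightly from sample_size.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[target_of(row)].append(row)

    total = len(rows)
    sample = []
    for members in strata.values():
        # Quota proportional to the class share, with at least one row per class.
        quota = max(1, round(sample_size * len(members) / total))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample
```

The one-row minimum keeps rare target values represented in the build data, at the price of slightly overshooting the requested size when many rare classes exist.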
-
Publication No.: US07627620B2
Publication Date: 2009-12-01
Application No.: US11012350
Filing Date: 2004-12-16
CPC Classification: G06F17/30598, G06F17/30539, Y10S707/99931, Y10S707/99933, Y10S707/99935, Y10S707/99956
Abstract: A data-centric data mining technique provides greater ease of use and flexibility, yet provides high quality data mining results by providing general methodologies for automatic data mining. A methodology for each major type of mining function is provided, including: supervised modeling (classification and regression), feature selection and ranking, clustering, outlier detection, projection of the data to lower dimensionality, association discovery, and data source comparison. A method for data-centric data mining comprises invoking a data mining feature to perform data mining on a data source, performing data mining on data from the data source using the data mining feature, wherein the data mining feature uses data mining processes and objects internal to the data mining feature and does not use data mining processes and objects external to the data mining feature, outputting data mining results from the data mining feature, and removing all data mining processes and objects internal to the data mining feature that were used to process the data from the data source.
-
Publication No.: US07490071B2
Publication Date: 2009-02-10
Application No.: US10927111
Filing Date: 2004-08-27
CPC Classification: G06F17/30595, G06F2216/03, G06K9/6253, G06K9/6269
Abstract: An implementation of SVM functionality improves efficiency and data security, reduces time consumption and the parameter tuning challenges presented to inexperienced users, and reduces the computational cost of building SVM models. A system for support vector machine processing comprises data stored in the system, a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the system, based on a plurality of model-building parameters, a parameter estimation unit operable to estimate values for at least some of the model-building parameters, and an apply unit operable to apply the support vector machine model using the data stored in the system.
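The abstract above does not say how the parameter estimation unit derives its values. As one generic illustration of estimating a model-building parameter from the data rather than asking the user, the sketch below applies the widely used median heuristic for the width of a Gaussian (RBF) kernel. The choice of heuristic, the function name, and the `max_pairs` cap are all assumptions and should not be read as the patented procedure.

```python
import math
import random
from statistics import median


def estimate_rbf_gamma(rows, max_pairs=1000, seed=0):
    """Estimate gamma for k(x, z) = exp(-gamma * ||x - z||^2).

    Uses the median heuristic: sigma is the median pairwise Euclidean
    distance, and gamma = 1 / (2 * sigma^2). At most max_pairs pairs are
    examined so that the cost stays bounded on large tables.
    """
    rng = random.Random(seed)
    n = len(rows)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if len(pairs) > max_pairs:
        pairs = rng.sample(pairs, max_pairs)
    sigma = median(math.dist(rows[i], rows[j]) for i, j in pairs)
    if sigma == 0:
        raise ValueError("median pairwise distance is zero; gamma is undefined")
    return 1.0 / (2.0 * sigma ** 2)
```

An estimate of this kind gives an inexperienced user a data-driven starting value instead of a hand-tuned one, which is the benefit the abstract claims for parameter estimation.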