Abstract:
An implementation of non-negative matrix factorization (NMF) functionality integrated into a relational database management system provides the capability to apply NMF to relational datasets and to sparse datasets. A database management system comprises a multi-dimensional data table operable to store data and a processing unit operable to perform non-negative matrix factorization on data stored in the multi-dimensional data table and to generate a plurality of data tables, each data table being smaller than the multi-dimensional data table and having reduced dimensionality relative to the multi-dimensional data table. The multi-dimensional data table may be a relational data table.
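As a minimal sketch of the factorization step, the following pure-Python example factors a small non-negative table V into two smaller tables W and H. It assumes the classic multiplicative update rule for the Frobenius objective, which the abstract does not name; the function names, the sample matrix, and the iteration count are illustrative assumptions, not the patented implementation.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor an n x m non-negative matrix into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# Illustrative sparse, non-negative table (4 rows, 3 attributes).
V = [[5, 3, 0], [4, 0, 0], [1, 1, 5], [0, 1, 4]]
W, H = nmf(V, rank=2)
```

Here W is 4 x 2 and H is 2 x 3, so each factor holds fewer values than the 4 x 3 input, which mirrors the "smaller tables with reduced dimensionality" described above.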
Abstract:
An implementation of support vector machine (SVM) functionality integrated into a relational database management system (RDBMS) improves efficiency and data security, reduces time consumption, reduces the parameter tuning challenges presented to the inexperienced user, and reduces the computational costs of building SVM models. A database management system comprises data stored in the database management system and a processing unit comprising a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the database management system, and an apply unit operable to apply the support vector machine model using the data stored in the database management system. The database management system may be a relational database management system.
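The build and apply units can be illustrated with a small sketch that keeps the data inside a database. It assumes SQLite as a stand-in RDBMS and a linear SVM trained by sub-gradient descent on the hinge loss; the table name `train`, the column names, and the function names `build_svm` and `apply_svm` are hypothetical and not taken from the abstract.

```python
import sqlite3

def build_svm(conn, table, epochs=200, lr=0.1, lam=0.01):
    """Build unit: fit a linear SVM on rows read from the database."""
    rows = conn.execute(f"SELECT x1, x2, label FROM {table}").fetchall()
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x1, x2, y in rows:
            # L2 regularization shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if y * (w[0] * x1 + w[1] * x2 + b) < 1.0:  # hinge loss is active
                w = [w[0] + lr * y * x1, w[1] + lr * y * x2]
                b += lr * y
    return w, b

def apply_svm(conn, table, model):
    """Apply unit: predict a label (+1 or -1) for every row of a table."""
    w, b = model
    rows = conn.execute(f"SELECT id, x1, x2 FROM {table}").fetchall()
    return {rid: (1 if w[0] * x1 + w[1] * x2 + b >= 0 else -1)
            for rid, x1, x2 in rows}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE train (id INTEGER PRIMARY KEY, x1 REAL, x2 REAL, label INTEGER)")
conn.executemany(
    "INSERT INTO train (x1, x2, label) VALUES (?, ?, ?)",
    [(2, 2, 1), (3, 1, 1), (2, 3, 1), (-2, -2, -1), (-3, -1, -1), (-1, -3, -1)])

model = build_svm(conn, "train")
predictions = apply_svm(conn, "train", model)
```

Both functions read their input through SQL queries, so the training data never has to be exported to a separate tool, which is the efficiency and security argument the abstract makes.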
Abstract:
An implementation of support vector machine (SVM) functionality improves efficiency and data security, reduces time consumption, reduces the parameter tuning challenges presented to the inexperienced user, and reduces the computational costs of building SVM models. A system for support vector machine processing comprises data stored in the system, a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the system, based on a plurality of model-building parameters, a parameter estimation unit operable to estimate values for at least some of the model-building parameters, and an apply unit operable to apply the support vector machine model using the data stored in the system.
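One way a parameter estimation unit might fill in missing model-building parameters is sketched below. It assumes the well-known median heuristic for the width of a Gaussian (RBF) kernel; the abstract does not say which estimators are used, and the names `estimate_gamma`, `resolve_parameters`, and the `complexity` default are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def estimate_gamma(rows):
    """Median heuristic: derive the RBF width from the median pairwise distance.

    Assumes the rows are not all identical (the median distance must be > 0).
    """
    sigma = median(math.dist(a, b) for a, b in combinations(rows, 2))
    return 1.0 / (2.0 * sigma ** 2)

def resolve_parameters(rows, user_params):
    """Keep user-supplied values and estimate only what is missing."""
    params = dict(user_params)
    if "gamma" not in params:
        params["gamma"] = estimate_gamma(rows)
    params.setdefault("complexity", 1.0)  # placeholder default, not a tuned value
    return params

rows = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
params = resolve_parameters(rows, {"complexity": 10.0})
```

In this example the pairwise distances are 5, 10, and 5, so the median is 5 and the estimated `gamma` is 1 / (2 * 25) = 0.02, while the user-supplied `complexity` is left untouched.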
Abstract:
A system and computer program product provide data mining model deployment (scoring) functionality as a family of SQL functions (operators). A database management system comprises a processor operable to execute computer program instructions, a memory operable to store computer program instructions executable by the processor, and computer program instructions stored in the memory and executable to implement a plurality of database query language statements, each statement operable to cause a data mining function to be performed.
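The idea of exposing scoring as an SQL function can be sketched with SQLite's user-defined functions. This is an illustration under stated assumptions only: the function name `predict_risk`, the `customers` table, and the model coefficients are hypothetical, and SQLite stands in for the database management system described in the abstract.

```python
import sqlite3

# Hypothetical model coefficients; a real system would load a stored model.
AGE_WEIGHT, INCOME_WEIGHT = 0.1, -0.0001

def predict_risk(age, income):
    """Scoring operator: return a class label for one row."""
    return "high" if AGE_WEIGHT * age + INCOME_WEIGHT * income > 0 else "low"

conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call it like a built-in.
conn.create_function("predict_risk", 2, predict_risk)

conn.execute("CREATE TABLE customers (name TEXT, age INTEGER, income REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [("ann", 50, 20000.0), ("bob", 30, 90000.0)])

# The scoring function can appear in the select list ...
scored = conn.execute(
    "SELECT name, predict_risk(age, income) FROM customers ORDER BY name"
).fetchall()
# ... or in a predicate, like any other SQL operator.
high_risk = conn.execute(
    "SELECT name FROM customers WHERE predict_risk(age, income) = 'high'"
).fetchall()
```

Because the model is invoked from within the query, scoring composes with filtering, joining, and ordering without moving rows out of the database.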
Abstract:
A data-centric data mining technique provides greater ease of use and flexibility, yet provides high-quality data mining results by providing general methodologies for automatic data mining. A methodology for each major type of mining function is provided, including: supervised modeling (classification and regression), feature selection and ranking, clustering, outlier detection, projection of the data to lower dimensionality, association discovery, and data source comparison. A method for data-centric data mining comprises invoking a data mining feature to perform data mining on a data source, performing data mining on data from the data source using the data mining feature, wherein the data mining feature uses data mining processes and objects internal to the data mining feature and does not use data mining processes and objects external to the data mining feature, outputting data mining results from the data mining feature, and removing all data mining processes and objects internal to the data mining feature that were used to process the data from the data source.
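The invoke, mine, output, and clean-up sequence can be sketched for one of the listed functions, outlier detection. The example assumes SQLite as the data source and a simple distance-from-mean rule as the mining process; the work table name `dm_internal_stats`, the function name `detect_outliers`, and the threshold are hypothetical choices made for illustration.

```python
import sqlite3

def detect_outliers(conn, table, column, threshold=2.0):
    """Return rowids whose value lies more than `threshold` std devs from the mean.

    All intermediate objects are created inside this call and removed before
    it returns, so the caller only ever sees the results.
    """
    work = "dm_internal_stats"  # hypothetical internal work table
    try:
        conn.execute(
            f"CREATE TABLE {work} AS "
            f"SELECT AVG({column}) AS mean, "
            f"AVG({column} * {column}) - AVG({column}) * AVG({column}) AS var "
            f"FROM {table}")
        mean, var = conn.execute(f"SELECT mean, var FROM {work}").fetchall()[0]
        std = max(var, 0.0) ** 0.5
        rows = conn.execute(f"SELECT rowid, {column} FROM {table}").fetchall()
        return [rid for rid, value in rows
                if std > 0 and abs(value - mean) > threshold * std]
    finally:
        # Remove every internal object that was used to process the data.
        conn.execute(f"DROP TABLE IF EXISTS {work}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (value REAL)")
conn.executemany("INSERT INTO readings VALUES (?)",
                 [(v,) for v in (10, 11, 9, 10, 12, 9, 11, 10, 100)])
outliers = detect_outliers(conn, "readings", "value")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
```

The `finally` clause is the design point: the work table is dropped even if the mining step fails, so no model or intermediate object outlives the call.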
Abstract:
The present invention relates to progress notification systems, computer program products, and methods of operation thereof that report the processing progress of data mining operations at regular periodic intervals. The system comprises: an input/output interface for exchanging information with a network; a memory for storing updated progress objects associated with the data mining operation as a set of data mining algorithms progresses in processing; and a processor coupled to the input/output interface and the memory, the processor for performing the data mining operation, the data mining operation implementing the set of data mining algorithms, and for generating a notification object for the data mining operation at a pre-determined interval, the notification object based on the progress objects at each of the pre-determined intervals.
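The relationship between progress objects and interval-based notification objects can be sketched as follows. The class name `ProgressNotifier`, its fields, and the injected clock are assumptions made for illustration; the abstract does not specify data structures. A simulated clock is used so the behavior is deterministic.

```python
import time

class ProgressNotifier:
    """Stores the latest progress per algorithm and emits periodic notifications."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.progress = {}        # progress objects: algorithm -> fraction done
        self.notifications = []   # notification objects generated so far
        self.next_due = clock() + interval

    def update(self, algorithm, fraction_done):
        """Record progress; emit a notification if the interval has elapsed."""
        self.progress[algorithm] = fraction_done
        now = self.clock()
        if now >= self.next_due:
            # Snapshot the progress objects as they stand at this interval.
            self.notifications.append({"time": now, "progress": dict(self.progress)})
            while self.next_due <= now:
                self.next_due += self.interval

# Simulated clock: each processing step takes 2 seconds, notifications every 5.
now = [0.0]
notifier = ProgressNotifier(interval=5.0, clock=lambda: now[0])
for step in range(1, 11):
    now[0] = step * 2.0
    notifier.update("clustering", step / 10)

times = [n["time"] for n in notifier.notifications]
```

Progress is recorded on every step, but a notification object is only generated when an interval boundary has passed, so the reporting rate is decoupled from the rate at which the algorithms update their progress.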
Abstract:
An enterprise-wide web data mining system, computer program product, and method of operation thereof use Internet-based data sources and operate in an automated and cost-effective manner. The enterprise web mining system comprises: a database coupled to a plurality of data sources, the database operable to store data collected from the data sources; a data mining engine coupled to a web server and the database, the data mining engine operable to generate a plurality of data mining models using the collected data; a server coupled to a network, the server operable to: receive a request for a prediction or recommendation over the network, generate a prediction or recommendation using the data mining models, and transmit the generated prediction or recommendation.
Abstract:
A new process called a vector approximation graph (VA-graph) leverages a tree-based vector quantizer to quickly learn the topological structure of the data. It then uses the learned topology to enhance the performance of the vector quantizer. A method for analyzing data comprises receiving data, partitioning the data and generating a tree based on the partitions, learning a topology of a distribution of the data, and finding a best matching unit in the data using the learned topology.
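The four steps of the method (receive, partition into a tree, learn a topology, find a best matching unit) can be sketched as below. This is a loose illustration and not the VA-graph algorithm itself: it assumes median splits on the widest dimension for the tree, a competitive Hebbian rule (link the two nearest centroids of each point) for the topology, and a greedy walk over the learned links to refine the tree's answer. All function names are hypothetical.

```python
import math
import random

def build_tree(points, leaf_size=2):
    """Partition: split at the median of the widest dimension until leaves are small."""
    if len(points) <= leaf_size:
        return {"centroid": tuple(sum(c) / len(points) for c in zip(*points))}
    dim = max(range(len(points[0])),
              key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    return {"dim": dim, "split": ordered[mid][dim],
            "left": build_tree(ordered[:mid], leaf_size),
            "right": build_tree(ordered[mid:], leaf_size)}

def leaves(node):
    """Collect the leaf centroids (the codebook of the quantizer)."""
    if "centroid" in node:
        return [node["centroid"]]
    return leaves(node["left"]) + leaves(node["right"])

def tree_lookup(node, q):
    """Fast but approximate: descend the tree to a single leaf."""
    while "centroid" not in node:
        node = node["left"] if q[node["dim"]] < node["split"] else node["right"]
    return node["centroid"]

def learn_topology(points, centroids):
    """Link the nearest and second-nearest centroid of every data point."""
    edges = {c: set() for c in centroids}
    for p in points:
        first, second = sorted(centroids, key=lambda c: math.dist(p, c))[:2]
        edges[first].add(second)
        edges[second].add(first)
    return edges

def find_bmu(tree, edges, q):
    """Start at the tree's leaf, then move to closer neighbors while any exist."""
    best = tree_lookup(tree, q)
    improved = True
    while improved:
        improved = False
        for neighbor in edges[best]:
            if math.dist(q, neighbor) < math.dist(q, best):
                best, improved = neighbor, True
    return best

rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(40)]
tree = build_tree(points)
centroids = leaves(tree)
edges = learn_topology(points, centroids)
bmu = find_bmu(tree, edges, (0.5, 0.5))
```

The greedy walk only ever moves to a strictly closer centroid, so the result is never worse than the tree lookup alone; it is not guaranteed to match an exhaustive search over all centroids.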