Abstract:
An implementation of non-negative matrix factorization (NMF) functionality integrated into a relational database management system provides the capability to apply NMF to relational datasets and to sparse datasets. A database management system comprises a multi-dimensional data table operable to store data and a processing unit operable to perform NMF on data stored in the multi-dimensional data table and to generate a plurality of data tables, each data table being smaller than the multi-dimensional data table and having reduced dimensionality relative to the multi-dimensional data table. The multi-dimensional data table may be a relational data table.
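The abstract does not say which factorization algorithm is used. As a minimal sketch, assuming the standard Lee-Seung multiplicative update rules for squared error (an assumption, not necessarily the method claimed here), the following pure-Python code factors a small non-negative table V into two smaller tables W and H. The names nmf, matmul, transpose, and squared_error are illustrative only.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def squared_error(V, W, H):
    """Sum of squared differences between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))


def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m).

    Uses multiplicative updates, which keep every entry non-negative
    because each step only multiplies by a non-negative ratio.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(Wt, matmul(W, H))
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(matmul(W, H), Ht)
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H
```

For a 4 x 3 input and rank 2, W is 4 x 2 and H is 2 x 3, so each factor has lower dimensionality than the original table, which is the property the abstract describes.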
Abstract:
An implementation of support vector machine (SVM) functionality integrated into a relational database management system (RDBMS) improves efficiency and data security, reduces time consumption, reduces the parameter tuning challenges presented to inexperienced users, and reduces the computational costs of building SVM models. A database management system comprises data stored in the database management system and a processing unit comprising a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the database management system, and an apply unit operable to apply the support vector machine model using the data stored in the database management system. The database management system may be a relational database management system.
Abstract:
An implementation of support vector machine (SVM) functionality improves efficiency and data security, reduces time consumption, reduces the parameter tuning challenges presented to inexperienced users, and reduces the computational costs of building SVM models. A computer program product for support vector machine processing in a computer system comprises computer program instructions for storing data, providing an interface to client software, building a support vector machine model on at least a portion of the stored data based on a plurality of model-building parameters, estimating values for at least some of the model-building parameters, and applying the support vector machine model using the stored data to generate a data mining output.
Abstract:
An implementation of support vector machine (SVM) functionality improves efficiency and data security, reduces time consumption, reduces the parameter tuning challenges presented to inexperienced users, and reduces the computational costs of building SVM models. A system for support vector machine processing comprises data stored in the system; a client application programming interface operable to provide an interface to client software; a build unit operable to build a support vector machine model on at least a portion of the data stored in the system, the portion of the data selected using a stratified sampling method with respect to a target distribution; and an apply unit operable to apply the support vector machine model using the data stored in the system.
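One reading of "stratified sampling with respect to a target distribution" is that the training subset keeps the class proportions of the target column. The sketch below illustrates that reading under stated assumptions: proportional allocation with at least one row per class, and a hypothetical stratified_sample helper that is not taken from the source.

```python
import random
from collections import defaultdict


def stratified_sample(rows, target_of, sample_size, seed=0):
    """Draw roughly sample_size rows while preserving target class proportions.

    rows       -- list of records
    target_of  -- function returning the target (class) value of a record
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[target_of(row)].append(row)

    total = len(rows)
    sample = []
    for members in strata.values():
        # Proportional allocation; keep at least one row of every class
        # so that rare targets are not lost (an illustrative choice).
        k = max(1, round(sample_size * len(members) / total))
        k = min(k, len(members))
        sample.extend(rng.sample(members, k))
    return sample
```

Because of rounding and the one-row minimum, the returned sample can differ slightly from sample_size when class proportions do not divide evenly.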
Abstract:
An implementation of support vector machine (SVM) functionality improves efficiency and data security, reduces time consumption, reduces the parameter tuning challenges presented to inexperienced users, and reduces the computational costs of building SVM models. A system for support vector machine processing comprises data stored in the system, a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the system based on a plurality of model-building parameters, a parameter estimation unit operable to estimate values for at least some of the model-building parameters, and an apply unit operable to apply the support vector machine model using the data stored in the system.
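The abstract does not state how the parameter estimation unit derives its values. Purely as an illustration of data-driven parameter estimation, the sketch below applies the widely used median heuristic for a Gaussian (RBF) kernel: the kernel width sigma is set to the median pairwise distance and gamma = 1 / (2 * sigma^2). The function names are hypothetical, and this is not claimed to be the estimator the source describes.

```python
import math
from itertools import combinations
from statistics import median


def estimate_rbf_gamma(points):
    """Estimate the RBF kernel parameter gamma from the data itself.

    Median heuristic: sigma is the median Euclidean distance between
    all pairs of points, and gamma = 1 / (2 * sigma**2).
    """
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    sigma = median(distances)
    if sigma == 0:
        raise ValueError("all points coincide; kernel width is undefined")
    return 1.0 / (2.0 * sigma ** 2)


def rbf_kernel(p, q, gamma):
    """Gaussian kernel value exp(-gamma * ||p - q||^2)."""
    return math.exp(-gamma * math.dist(p, q) ** 2)
```

An estimate like this gives an inexperienced user a reasonable starting value without manual tuning, which is the kind of burden the abstract says the system reduces.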
Abstract:
A database management system provides the capability to perform cluster analysis and provides improved performance in model building and data mining, good integration with the various databases throughout the enterprise, and flexible specification and adjustment of the models being built, while also providing data mining functionality that is accessible to users having limited data mining expertise and reducing development times and costs for data mining projects. The database management system for in-database clustering comprises a first data table and a second data table, each data table including a plurality of rows of data, means for building an enhanced K-means clustering model using the first data table, and means for applying the enhanced K-means clustering model using the second data table to generate apply output data.
Abstract:
A system, method, and computer program product for in-database clustering provide the capability to perform cluster analysis and provide improved performance in model building and data mining, good integration with the various databases throughout the enterprise, and flexible specification and adjustment of the models being built, while also providing data mining functionality that is accessible to users having limited data mining expertise and reducing development times and costs for data mining projects. A database management system for in-database clustering comprises a first data table and a second data table, each data table including a plurality of rows of data, means for building a clustering model using the first data table, and means for applying the clustering model using the second data table to generate apply output data.
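The build-then-apply split described above can be sketched as follows. This is a minimal illustration using plain Lloyd's K-means on in-memory lists, assumed here only because the abstract leaves the clustering algorithm open; it is not the in-database implementation, and build_kmeans, apply_kmeans, and nearest are hypothetical names.

```python
import math
import random


def nearest(centroids, row):
    """Index of the centroid closest to row (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], row))


def build_kmeans(rows, k, iterations=20, seed=0):
    """Build step: learn k centroids from the first data table."""
    rng = random.Random(seed)
    centroids = [tuple(r) for r in rng.sample(rows, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for row in rows:
            groups[nearest(centroids, row)].append(row)
        updated = []
        for old, members in zip(centroids, groups):
            if members:
                # New centroid is the per-attribute mean of its members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids


def apply_kmeans(centroids, rows):
    """Apply step: assign each row of the second data table to a cluster."""
    return [nearest(centroids, row) for row in rows]
```

Keeping the model (the centroids) separate from the data it was built on is what lets the same model score a second table to produce the apply output.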
Abstract:
A database management system provides the capability to perform cluster analysis and provides improved performance in model building and data mining, good integration with the various databases throughout the enterprise, and flexible specification and adjustment of the models being built, while also providing data mining functionality that is accessible to users having limited data mining expertise and reducing development times and costs for data mining projects. The database management system for in-database clustering comprises a first data table and a second data table, each data table including a plurality of rows of data, means for building an Orthogonal Partitioning Clustering model using the first data table, and means for applying the Orthogonal Partitioning Clustering model using the second data table to generate apply output data.
Abstract:
A database management system provides the capability to perform cluster analysis and provides improved performance in model building and data mining, good integration with the various databases throughout the enterprise, and flexible specification and adjustment of the models being built, while also providing data mining functionality that is accessible to users having limited data mining expertise and reducing development times and costs for data mining projects. The database management system for in-database clustering comprises a first data table and a second data table, each data table including a plurality of rows of data, means for building a clustering model using the first data table, means for building a rule-based model using the clustering model, and means for applying the rule-based model using the second data table to generate apply output data.
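The abstract does not specify the form of the rules. One simple possibility, assumed here for illustration only, is to summarize each cluster as a bounding box: a conjunction of per-attribute range conditions derived from the rows assigned to that cluster. The helpers build_rules and apply_rules are hypothetical.

```python
def build_rules(rows, labels):
    """Derive one range rule per cluster from clustered rows.

    Returns {cluster_label: [(low, high), ...]} with one interval per
    attribute, covering every row assigned to that cluster.
    """
    rules = {}
    for row, label in zip(rows, labels):
        if label not in rules:
            rules[label] = [(v, v) for v in row]
        else:
            rules[label] = [(min(lo, v), max(hi, v))
                            for (lo, hi), v in zip(rules[label], row)]
    return rules


def apply_rules(rules, row):
    """Return the first cluster whose rule the row satisfies, else None."""
    for label, box in rules.items():
        if all(lo <= v <= hi for (lo, hi), v in zip(box, row)):
            return label
    return None
```

A rule such as "0 <= x <= 1 AND 0 <= y <= 2" can be read and checked by a user without data mining expertise, and a row that matches no rule is reported as unassigned rather than forced into a cluster.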
Abstract:
A database management system provides the capability to perform cluster analysis and provides improved performance in model building and data mining, good integration with the various databases throughout the enterprise, and flexible specification and adjustment of the models being built, while also providing data mining functionality that is accessible to users having limited data mining expertise and reducing development times and costs for data mining projects. A database management system for in-database clustering comprises a first data table and a second data table, each data table including a plurality of rows of data, means for building a clustering model using a portion of the first data table, wherein the portion of the first data table is selected by partitioning, density summarization, or active sampling of the first data table, and means for applying the clustering model using the second data table to generate apply output data.