Abstract:
The subject disclosure pertains to extensible data mining systems, means, and methodologies. For example, a data mining system is disclosed that supports plug-in integration of non-native mining algorithms, such as those provided by third parties, so that they function the same as built-in algorithms. Non-native data mining viewers can likewise be integrated seamlessly into the system to display the results of one or more algorithms, whether provided by third parties or built-in. Support is also provided for extending data mining languages to include user-defined functions (UDFs).
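One way to picture the plug-in idea is a registry through which built-in and non-native algorithms are created by the same call. The following is a minimal Python sketch under that assumption, not the disclosed implementation; `REGISTRY`, `register_algorithm`, `create_model`, `MajorityClass`, and `ThresholdRule` are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps an algorithm name to a factory for that algorithm.
REGISTRY: Dict[str, Callable[..., object]] = {}


def register_algorithm(name: str, factory: Callable[..., object]) -> None:
    """Register a built-in or plug-in algorithm under a unique name."""
    if name in REGISTRY:
        raise ValueError(f"algorithm already registered: {name}")
    REGISTRY[name] = factory


def create_model(algorithm: str, **params) -> object:
    """Create a model by name; callers cannot tell plug-ins from built-ins."""
    return REGISTRY[algorithm](**params)


class MajorityClass:
    """Toy stand-in for a built-in algorithm: predicts the most common label."""

    def __init__(self, target: str) -> None:
        self.target = target
        self.label = None

    def train(self, rows: List[dict]) -> None:
        counts: Dict[object, int] = {}
        for row in rows:
            counts[row[self.target]] = counts.get(row[self.target], 0) + 1
        self.label = max(counts, key=counts.get)

    def predict(self, row: dict) -> object:
        return self.label


class ThresholdRule:
    """Toy stand-in for a third-party plug-in: True when a column is at or
    above the mean seen during training."""

    def __init__(self, column: str) -> None:
        self.column = column
        self.cutoff = 0.0

    def train(self, rows: List[dict]) -> None:
        self.cutoff = sum(row[self.column] for row in rows) / len(rows)

    def predict(self, row: dict) -> bool:
        return row[self.column] >= self.cutoff


# Both kinds of algorithm go through the same registration path.
register_algorithm("majority_class", MajorityClass)
register_algorithm("threshold_rule", ThresholdRule)
```

Because both classes expose the same `train` and `predict` methods, code that calls `create_model` does not need to know which of them shipped with the system.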
Abstract:
A system that facilitates data mining comprises a reception component that receives command(s) in a declarative language that relate to utilizing an output of a first data mining model as an input to a second data mining model. An implementation component analyzes the received command(s) and implements the command(s) with respect to the first and second data mining models. In another aspect of the subject invention, the reception component can receive further command(s) in a declarative language for causing one or more of the first and second data mining models to output a prediction, the prediction desirably generated without prediction input; the implementation component then causes the one or more of the first and second data mining models to output the prediction.
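The chaining described above can be illustrated outside any declarative language. The Python sketch below is an assumption-laden toy, not the claimed system: `SegmentModel`, `LabelBySegment`, and the `spend` and `churned` columns are hypothetical. The second model consumes the first model's output as its input, and calling `predict()` with no row shows one reading of a prediction generated without prediction input.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


class SegmentModel:
    """First model (hypothetical): labels a row 'high' or 'low' relative to
    the mean of the 'spend' column seen during training."""

    def __init__(self) -> None:
        self.mean = 0.0

    def train(self, rows: List[dict]) -> None:
        self.mean = sum(row["spend"] for row in rows) / len(rows)

    def predict(self, row: dict) -> str:
        return "high" if row["spend"] >= self.mean else "low"


class LabelBySegment:
    """Second model (hypothetical): its only input is the first model's output."""

    def __init__(self, upstream: SegmentModel) -> None:
        self.upstream = upstream
        self.by_segment: Dict[str, Counter] = defaultdict(Counter)
        self.overall: Counter = Counter()

    def train(self, rows: List[dict]) -> None:
        for row in rows:
            segment = self.upstream.predict(row)  # output of model 1 feeds model 2
            self.by_segment[segment][row["churned"]] += 1
            self.overall[row["churned"]] += 1

    def predict(self, row: Optional[dict] = None) -> object:
        # With no input row, fall back to the most likely label overall.
        if row is None:
            return self.overall.most_common(1)[0][0]
        counts = self.by_segment.get(self.upstream.predict(row))
        if not counts:
            return self.overall.most_common(1)[0][0]
        return counts.most_common(1)[0][0]
```

In a declarative setting the same wiring would be expressed by a command rather than by passing `upstream` to a constructor; the sketch only shows the data flow.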
Abstract:
A standard mechanism for directly accessing unstructured data types (e.g., image, audio, video, gene sequencing and text data) in accordance with data mining operations is provided. The subject innovation can enable access to unstructured data directly from within the data mining engine or tool. Accordingly, the innovation enables multiple vendors to provide algorithms for mining unstructured data on a data mining platform (e.g., an SQL-brand server), thereby increasing adoption. The subject innovation also allows users to directly mine unstructured data that is not fixed-length, without pre-processing and tokenizing the data external to the data mining engine. To that end, the innovation can provide a mechanism to expand declarative language content types to include an “unstructured” data type, thereby enabling a user and/or application to affirmatively designate mining data as an unstructured type.
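As a rough illustration of designating a column as unstructured, the Python sketch below adds an `UNSTRUCTURED` member to a content-type enumeration and hands the raw, variable-length value to a feature extractor inside the engine. Everything here is hypothetical (`ContentType`, `Column`, `byte_histogram`, `prepare_case`); the byte histogram merely stands in for whatever a vendor-supplied algorithm would actually compute.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class ContentType(Enum):
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"
    UNSTRUCTURED = "unstructured"  # hypothetical added content type


@dataclass
class Column:
    name: str
    content: ContentType


def byte_histogram(blob: bytes) -> Dict[str, int]:
    """Stand-in extractor: counts byte values; accepts input of any length."""
    features: Dict[str, int] = {}
    for value in blob:
        key = f"byte_{value}"
        features[key] = features.get(key, 0) + 1
    return features


def prepare_case(
    columns: List[Column],
    row: dict,
    extractor: Callable[[bytes], Dict[str, int]],
) -> dict:
    """Build one training case; unstructured columns are expanded in-engine
    instead of being tokenized by an external pre-processing step."""
    case: dict = {}
    for column in columns:
        value = row[column.name]
        if column.content is ContentType.UNSTRUCTURED:
            for key, count in extractor(value).items():
                case[f"{column.name}.{key}"] = count
        else:
            case[column.name] = value
    return case
```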
Abstract:
The subject invention relates to systems and methods to extend the capabilities of declarative data modeling languages. In one aspect, a declarative data modeling language system is provided. The system includes a data modeling language component that generates one or more data mining models to extract predictive information from local or remote databases. A language extension component facilitates modeling capability in the data modeling language by providing a data sequence model or a time series model within the data modeling language to support various data mining applications.
Abstract:
A language schema is provided that integrates multidimensional extensions (e.g., MDX) and data mining extensions (e.g., DMX) for performing data mining operations on data residing in OLAP cubes. The schema provides that the source data query need not be a relational query; it can instead be a multidimensional query formed using MDX, for example. The operations of model creation, training and prediction are described.
Abstract:
A method for performing data mining is provided. The method includes selecting at least one data source of unstructured text. Additionally, a transformation is selected to identify a list of terms in the unstructured text. A run-time path is established to connect the data source to the transformation to load the list of terms identified into a destination database.
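The three stages named above (a source of unstructured text, a term-identifying transformation, and a destination database) can be sketched as a small pipeline. This Python example is only an illustration under stated assumptions: the function names, the stop-word list, the regular-expression tokenizer, and the `terms` table in SQLite are all hypothetical choices, not the method's actual components.

```python
import re
import sqlite3
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Assumed stop-word list; a real transformation would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}


def text_source(documents: Iterable[str]) -> Iterator[str]:
    """Data source: yields unstructured text one document at a time."""
    yield from documents


def extract_terms(texts: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Transformation: identifies terms and counts their occurrences."""
    counts: Counter = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    yield from sorted(counts.items())


def load_terms(conn: sqlite3.Connection, terms: Iterable[Tuple[str, int]]) -> None:
    """Destination: writes the term list into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS terms (term TEXT PRIMARY KEY, frequency INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO terms VALUES (?, ?)", terms)
    conn.commit()


def run_pipeline(documents: Iterable[str], conn: sqlite3.Connection) -> None:
    """Run-time path: connects source -> transformation -> destination."""
    load_terms(conn, extract_terms(text_source(documents)))
```

Running `run_pipeline` against an in-memory database (`sqlite3.connect(":memory:")`) is enough to try it out without creating a file.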