Systems and methods that facilitate data mining
    1.
    Granted patent
    Systems and methods that facilitate data mining (status: lapsed)

    Publication No.: US07398268B2

    Publication date: 2008-07-08

    Application No.: US11049031

    Filing date: 2005-02-02

    IPC classification: G06F17/30

    Abstract: A system that facilitates data mining comprises a reception component that receives command(s) in a declarative language that relate to utilizing an output of a first data mining model as an input to a second data mining model. An implementation component analyzes the received command(s) and implements the command(s) with respect to the first and second data mining models. In another aspect of the subject invention, the reception component can receive further command(s) in a declarative language with respect to causing one or more of the first and second data mining models to output a prediction, the prediction desirably generated without prediction input; the implementation component then causes the one or more of the first and second data mining models to output the prediction.

    Extensible data mining framework
    2.
    Granted patent
    Extensible data mining framework (status: in force)

    Publication No.: US07383234B2

    Publication date: 2008-06-03

    Application No.: US11157602

    Filing date: 2005-06-21

    IPC classification: G06N5/00

    CPC classification: G06F17/30539 G06F2216/03

    Abstract: The subject disclosure pertains to extensible data mining systems, means, and methodologies. For example, a data mining system is disclosed that supports plug-in or integration of non-native mining algorithms, perhaps provided by third parties, such that they function the same as built-in algorithms. Furthermore, non-native data mining viewers may also be seamlessly integrated into the system for displaying the results of one or more algorithms, including those provided by third parties as well as those built in. Still further, support is provided for extending data mining languages to include user-defined functions (UDFs).

    Modeling sequence and time series data in predictive analytics
    3.
    Granted patent
    Modeling sequence and time series data in predictive analytics (status: in force)

    Publication No.: US07747641B2

    Publication date: 2010-06-29

    Application No.: US11116832

    Filing date: 2005-04-28

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30539 G06F17/30548

    Abstract: The subject invention relates to systems and methods to extend the capabilities of declarative data modeling languages. In one aspect, a declarative data modeling language system is provided. The system includes a data modeling language component that generates one or more data mining models to extract predictive information from local or remote databases. A language extension component facilitates modeling capability in the data modeling language by providing a data sequence model or a time series model within the data modeling language to support various data mining applications.

    Drill-through queries from data mining model content
    5.
    Granted patent
    Drill-through queries from data mining model content (status: lapsed)

    Publication No.: US07188090B2

    Publication date: 2007-03-06

    Application No.: US10611119

    Filing date: 2003-06-30

    IPC classification: G06F17/00 G06F17/20

    Abstract: A drill-through feature is provided that gives universal drill-through access to mining model source data from a trained mining model. In order for a user or application to obtain model content information on a given node of a model, a universal function is provided whereby the user specifies the node for a model and data set, and the cases underlying that node for that model and data set are returned. A sampling of underlying cases may be provided, where only a sampling of the cases represented in the node is requested.
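
The drill-through idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `drill_through`, the representation of a model as a mapping from node IDs to membership predicates, and the `sample_size` and `seed` parameters are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Optional

Case = Dict[str, object]

def drill_through(
    node_predicates: Dict[str, Callable[[Case], bool]],  # hypothetical model content
    node_id: str,
    cases: List[Case],
    sample_size: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[Case]:
    """Return the cases underlying one node, optionally as a random sample."""
    belongs_to_node = node_predicates[node_id]
    matches = [case for case in cases if belongs_to_node(case)]
    # Only sample when the caller asked for fewer cases than the node holds.
    if sample_size is not None and sample_size < len(matches):
        return random.Random(seed).sample(matches, sample_size)
    return matches

# Toy model with two leaf nodes split on age (assumed data).
model = {
    "young": lambda c: c["age"] < 30,
    "older": lambda c: c["age"] >= 30,
}
data = [{"age": 22}, {"age": 35}, {"age": 28}, {"age": 41}, {"age": 19}]
```

Calling `drill_through(model, "young", data)` returns the three cases with age below 30; passing `sample_size=2` returns two of them.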

    System and method for feature selection in decision trees
    7.
    Granted patent
    System and method for feature selection in decision trees (status: lapsed)

    Publication No.: US07251639B2

    Publication date: 2007-07-31

    Application No.: US10185663

    Filing date: 2002-06-27

    IPC classification: G06F17/00

    CPC classification: G06Q10/10

    Abstract: Selection of certain attributes as output and input attributes is provided so a decision tree may be created more efficiently. For each possible output attribute an interestingness score is calculated. This interestingness score is based on entropy of the output attribute and a desirable entropy constant. The attributes with the highest interestingness score are used as output attributes in the creation of the decision tree. Score gains for each input attribute over the output attributes are calculated using a conventional scoring algorithm. The sum of the score gains over all output attributes for each input attribute is calculated. The attributes with the highest score gain sums are used as input attributes in the creation of the decision tree.
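
The selection procedure in this abstract can be sketched in Python under stated assumptions. The abstract does not give the interestingness formula, so the sketch assumes it is the negative distance between an attribute's entropy and the target entropy constant. It also substitutes information gain for the unspecified "conventional scoring algorithm". All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

def entropy(values: Sequence) -> float:
    """Shannon entropy (bits) of a column of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def interestingness(values: Sequence, target_entropy: float) -> float:
    # ASSUMPTION: closer to the desirable entropy constant means more interesting.
    return -abs(entropy(values) - target_entropy)

def information_gain(inputs: Sequence, outputs: Sequence) -> float:
    """Stand-in for the 'conventional scoring algorithm' (an assumption)."""
    groups: Dict[object, List] = {}
    for x, y in zip(inputs, outputs):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / len(outputs) * entropy(g) for g in groups.values())
    return entropy(outputs) - conditional

def select_attributes(
    table: Dict[str, List], n_outputs: int, n_inputs: int, target_entropy: float = 1.0
) -> Tuple[List[str], List[str]]:
    # Step 1: keep the attributes with the highest interestingness as outputs.
    outputs = sorted(
        table, key=lambda a: interestingness(table[a], target_entropy), reverse=True
    )[:n_outputs]
    # Step 2: sum each attribute's score gain over all chosen outputs.
    gain_sums = {
        a: sum(information_gain(table[a], table[o]) for o in outputs if o != a)
        for a in table
    }
    # Step 3: keep the attributes with the highest gain sums as inputs.
    inputs = sorted(table, key=lambda a: gain_sums[a], reverse=True)[:n_inputs]
    return outputs, inputs

# Toy table (assumed data): "c" is constant, so its entropy is 0.
table = {"a": [0, 0, 1, 1], "b": [0, 0, 1, 1], "c": [0, 0, 0, 0], "d": [0, 1, 0, 1]}
outputs, inputs = select_attributes(table, n_outputs=3, n_inputs=2)
```

With this toy table, the constant attribute `c` is not chosen as an output, and `a` and `b`, which predict each other perfectly, have the highest gain sums.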

    Systems and methods for mining model accuracy display for multiple state prediction
    8.
    Granted patent
    Systems and methods for mining model accuracy display for multiple state prediction (status: in force)

    Publication No.: US07379843B2

    Publication date: 2008-05-27

    Application No.: US10932583

    Filing date: 2004-09-01

    IPC classification: G06F15/00

    CPC classification: G06N7/00

    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
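
The points described in this abstract can be computed with a short Python sketch. It covers only the data behind the graph, not the display itself. The function name `accuracy_points` and the tuple layout `(predicted_state, probability, actual_state)` are assumptions.

```python
from typing import List, Tuple

def accuracy_points(cases: List[Tuple[str, float, str]]) -> List[Tuple[int, int]]:
    """Return (N, correct predictions among the top N cases by probability).

    Each case is (predicted_state, probability, actual_state); this layout
    is hypothetical and chosen only for the example.
    """
    ranked = sorted(cases, key=lambda case: case[1], reverse=True)
    points = []
    correct = 0
    for n, (predicted, _probability, actual) in enumerate(ranked, start=1):
        if predicted == actual:
            correct += 1
        points.append((n, correct))
    return points

# Four test cases for a three-state variable (assumed data).
cases = [("A", 0.9, "A"), ("B", 0.4, "A"), ("C", 0.7, "C"), ("A", 0.6, "B")]
points = accuracy_points(cases)
# points == [(1, 1), (2, 2), (3, 2), (4, 2)]
```

Plotting N on one axis and the running count on the other yields the curve: it climbs quickly when the model's high-probability predictions are mostly correct and flattens where they are wrong.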

    System and method for mining model accuracy display
    9.
    Patent application
    System and method for mining model accuracy display (status: under examination, published)

    Publication No.: US20070010966A1

    Publication date: 2007-01-11

    Application No.: US11519317

    Filing date: 2006-09-11

    IPC classification: G06F17/18 G06F19/00

    CPC classification: G06F17/18 G06F16/2465

    Abstract: Systems and methods are provided for producing displays of the accuracy of data mining or statistical models that produce associative predictions. For all cases in a testing data set, the model makes predictions and provides associated probabilities. The cases are sorted by their probability of making accurate predictions, and a graph is made of the accuracy of the model over various subsets containing the highest-probability cases as evaluated by the model. Where a number of probabilities are presented for the predictions in a basket of predictions, those probabilities are combined to yield a probability score for the entire basket. Additionally, the accuracy of a model over different basket sizes may be graphed. The accuracy graph may also be produced for any model making a prediction, by sorting cases by their probability of making accurate predictions and graphing the accuracy of the model over various subsets of the data containing the highest-probability cases.

    Systems and methods for mining model accuracy display for multiple state prediction
    10.
    Patent application
    Systems and methods for mining model accuracy display for multiple state prediction (status: in force)

    Publication No.: US20050027478A1

    Publication date: 2005-02-03

    Application No.: US10932583

    Filing date: 2004-09-01

    CPC classification: G06N7/00

    Abstract: Systems and methods are provided for producing a mining model accuracy display that depicts the model's accuracy at predicting a state for a multiple-state variable. The model predicts a state and provides an associated probability for each case. Points are graphed such that one coordinate of the data point corresponds to a number N of cases and the other coordinate corresponds to the number of correct predictions made in the top N cases by probability.
