System and method for building decision trees in a database
    1.
    Granted patent (legal status: in force)

    Publication No.: US09135309B2

    Publication date: 2015-09-15

    Application No.: US13300030

    Filing date: 2011-11-18

    IPC classification: G06F17/30

    CPC classification: G06F17/30539

    Abstract: A computer-implemented method of creating a data mining model in a database management system comprises accepting a database language statement at the database management system, the database language statement indicating a dataset and a data mining model to be created from the dataset, and creating, in the database management system, the indicated data mining model using the indicated dataset, wherein creation and application of the data mining model does not require moving data to a separate data mining engine.

    System and method for building decision trees in a database
    2.
    Granted patent (legal status: in force)

    Publication No.: US08065326B2

    Publication date: 2011-11-22

    Application No.: US11344112

    Filing date: 2006-02-01

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30539

    Abstract: Decision trees are efficiently represented in a relational database. A computer-implemented method of representing a decision tree model in relational form comprises providing a directed acyclic graph comprising a plurality of nodes and a plurality of links, each link connecting a plurality of nodes, encoding a tree structure by including in each node a parent-child relationship of the node with other nodes, encoding in each node information relating to a split represented by the node, the split information including a splitting predictor and a split value, and encoding in each node a target histogram.
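
The relational encoding described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the table name `tree_node`, its column names, the `branch` column, the JSON encoding of the histogram, and the use of SQLite are all assumptions made for illustration. Each row stores a node's parent link, its splitting predictor and split value, and its target histogram.

```python
import json
import sqlite3


def build_demo_tree(conn):
    """Create a hypothetical node table and store a one-split tree in it."""
    conn.execute("""
        CREATE TABLE tree_node (
            node_id          INTEGER PRIMARY KEY,
            parent_id        INTEGER,  -- NULL for the root
            branch           TEXT,     -- 'le' or 'gt' relative to the parent's split
            split_predictor  TEXT,     -- NULL for leaves
            split_value      REAL,     -- NULL for leaves
            target_histogram TEXT      -- JSON map: target value -> row count
        )
    """)
    rows = [
        (1, None, None, "age", 30.0, json.dumps({"yes": 5, "no": 5})),
        (2, 1, "le", None, None, json.dumps({"yes": 4, "no": 1})),
        (3, 1, "gt", None, None, json.dumps({"yes": 1, "no": 4})),
    ]
    conn.executemany("INSERT INTO tree_node VALUES (?, ?, ?, ?, ?, ?)", rows)


def predict(conn, record):
    """Walk from the root to a leaf and return the leaf's most frequent target."""
    cols = "node_id, split_predictor, split_value, target_histogram"
    node = conn.execute(
        f"SELECT {cols} FROM tree_node WHERE parent_id IS NULL"
    ).fetchone()
    while node[1] is not None:  # internal node: follow the matching child
        branch = "le" if record[node[1]] <= node[2] else "gt"
        node = conn.execute(
            f"SELECT {cols} FROM tree_node WHERE parent_id = ? AND branch = ?",
            (node[0], branch),
        ).fetchone()
    histogram = json.loads(node[3])
    return max(histogram, key=histogram.get)


conn = sqlite3.connect(":memory:")
build_demo_tree(conn)
print(predict(conn, {"age": 25}))
print(predict(conn, {"age": 40}))
```

Storing the histogram on every node, not only on leaves, means a prediction can still be read off an internal node if the tree is later truncated at that point.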

    System and method for building decision tree classifiers using bitmap techniques
    3.
    Granted patent (legal status: in force)

    Publication No.: US07571159B2

    Publication date: 2009-08-04

    Application No.: US11344193

    Filing date: 2006-02-01

    IPC classification: G06F7/00 G06F17/30 G06F17/00

    Abstract: A method, system, and computer program product for counting predictor-target pairs for a decision tree model provides the capability to generate count tables that is quicker and more efficient than previous techniques. A method of counting predictor-target pairs for a decision tree model, the decision tree model based on data stored in a database, the data comprising a plurality of rows of data, at least one predictor and at least one target, comprises generating a bitmap for each split node of data stored in a database system by intersecting a parent node bitmap and a bitmap of a predictor that satisfies a condition of the node, intersecting each split node bitmap with each predictor bitmap and with each target bitmap to form intersected bitmaps, and counting bits of each intersected bitmap to generate a count of predictor-target pairs.
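
The counting scheme in this abstract can be sketched with Python integers standing in for bitmaps, where bit i is set when row i satisfies a condition. The function names `make_bitmaps` and `count_pairs` and the toy data are assumptions for illustration only; the abstract does not specify the bitmap layout a database system would use.

```python
def make_bitmaps(values):
    """Map each distinct value to an int whose bit i is set when row i has that value."""
    bitmaps = {}
    for i, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << i)
    return bitmaps


def count_pairs(node_bitmap, predictor_bitmaps, target_bitmaps):
    """Intersect the node bitmap with every predictor and target bitmap, then count bits."""
    counts = {}
    for p_value, p_bitmap in predictor_bitmaps.items():
        for t_value, t_bitmap in target_bitmaps.items():
            intersected = node_bitmap & p_bitmap & t_bitmap
            counts[(p_value, t_value)] = bin(intersected).count("1")
    return counts


# Toy data: five rows, two predictors and one target (all hypothetical).
outlook = ["sunny", "rain", "sunny", "rain", "sunny"]
windy = ["y", "n", "n", "y", "n"]
play = ["no", "yes", "yes", "no", "yes"]

root_bitmap = (1 << len(play)) - 1                 # every row belongs to the root
child_bitmap = root_bitmap & make_bitmaps(windy)["n"]  # split node: windy == 'n'

counts = count_pairs(child_bitmap, make_bitmaps(outlook), make_bitmaps(play))
print(counts)
```

Because each count is a bitwise AND followed by a population count, the rows themselves are never rescanned once the per-value bitmaps exist.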

    System and method for building decision trees in a database
    4.
    Patent application (legal status: in force)

    Publication No.: US20070179966A1

    Publication date: 2007-08-02

    Application No.: US11344112

    Filing date: 2006-02-01

    IPC classification: G06F7/00

    CPC classification: G06F17/30539

    Abstract: Decision trees are efficiently represented in a relational database. A computer-implemented method of representing a decision tree model in relational form comprises providing a directed acyclic graph comprising a plurality of nodes and a plurality of links, each link connecting a plurality of nodes, encoding a tree structure by including in each node a parent-child relationship of the node with other nodes, encoding in each node information relating to a split represented by the node, the split information including a splitting predictor and a split value, and encoding in each node a target histogram.

    System and method for building decision tree classifiers using bitmap techniques
    5.
    Patent application (legal status: in force)

    Publication No.: US20070192341A1

    Publication date: 2007-08-16

    Application No.: US11344193

    Filing date: 2006-02-01

    IPC classification: G06F7/00

    Abstract: A method, system, and computer program product for counting predictor-target pairs for a decision tree model provides the capability to generate count tables that is quicker and more efficient than previous techniques. A method of counting predictor-target pairs for a decision tree model, the decision tree model based on data stored in a database, the data comprising a plurality of rows of data, at least one predictor and at least one target, comprises generating a bitmap for each split node of data stored in a database system by intersecting a parent node bitmap and a bitmap of a predictor that satisfies a condition of the node, intersecting each split node bitmap with each predictor bitmap and with each target bitmap to form intersected bitmaps, and counting bits of each intersected bitmap to generate a count of predictor-target pairs.

    Binning predictors using per-predictor trees and MDL pruning
    6.
    Granted patent (legal status: in force)

    Publication No.: US08280915B2

    Publication date: 2012-10-02

    Application No.: US11344185

    Filing date: 2006-02-01

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06K9/6282

    Abstract: Binning of predictor values used for generating a data mining model provides useful reduction in memory footprint and computation during the computationally dominant decision tree build phase, but reduces the information loss of the model and reduces the introduction of false information artifacts. A method of binning data in a database for data mining modeling in a database system, the data stored in a database table in the database system, the data mining modeling having selected at least one predictor and one target for the data, the data including a plurality of values of the predictor and a plurality of values of the target, the method comprises constructing a binary tree for the predictor that splits the values of the predictor into a plurality of portions, pruning the binary tree, and defining as bins of the predictor leaves of the tree that remain after pruning, each leaf of the tree representing a portion of the values of the predictor.
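
The three steps named in this abstract (build a binary tree over one predictor, prune it, take the surviving leaves as bins) can be sketched as follows. This is a simplified illustration, not the patented method: the entropy-based split choice, the fixed `split_penalty_bits` used as a crude stand-in for an MDL cost, and all function names are assumptions, since the abstract does not give the exact criteria.

```python
import math
from collections import Counter


def entropy_bits(targets):
    """Bits needed to encode the targets under their empirical distribution (N * H)."""
    n = len(targets)
    return -sum(c * math.log2(c / n) for c in Counter(targets).values())


def build(pairs, depth=0, max_depth=4):
    """Grow a binary tree over (predictor value, target) pairs for one predictor."""
    node = {"pairs": pairs, "split": None}
    values = sorted({v for v, _ in pairs})
    if depth >= max_depth or len(values) < 2:
        return node
    best = None
    for lo, hi in zip(values, values[1:]):  # candidate cuts between distinct values
        cut = (lo + hi) / 2
        left = [p for p in pairs if p[0] <= cut]
        right = [p for p in pairs if p[0] > cut]
        cost = entropy_bits([t for _, t in left]) + entropy_bits([t for _, t in right])
        if best is None or cost < best[0]:
            best = (cost, cut, left, right)
    _, cut, left, right = best
    node["split"] = cut
    node["left"] = build(left, depth + 1, max_depth)
    node["right"] = build(right, depth + 1, max_depth)
    return node


def prune(node, split_penalty_bits):
    """Bottom-up pruning: collapse a split unless it lowers the total cost in bits."""
    leaf_cost = entropy_bits([t for _, t in node["pairs"]])
    if node["split"] is None:
        return leaf_cost
    subtree_cost = (prune(node["left"], split_penalty_bits)
                    + prune(node["right"], split_penalty_bits)
                    + split_penalty_bits)
    if subtree_cost >= leaf_cost:
        node["split"] = None  # the split does not pay for itself
        return leaf_cost
    return subtree_cost


def bins(node, lo=-math.inf, hi=math.inf):
    """Return the (lo, hi] value ranges of the leaves that remain after pruning."""
    if node["split"] is None:
        return [(lo, hi)]
    return bins(node["left"], lo, node["split"]) + bins(node["right"], node["split"], hi)


data = [(v, "a" if v <= 4 else "b") for v in range(1, 9)]
tree = build(data)
prune(tree, split_penalty_bits=3.0)
print(bins(tree))
```

In this toy data the only split that reduces the encoding cost is the one between 4 and 5, so the deeper splits grown by `build` are collapsed and two bins remain.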

    Binning predictors using per-predictor trees and MDL pruning
    7.
    Patent application (legal status: in force)

    Publication No.: US20070185896A1

    Publication date: 2007-08-09

    Application No.: US11344185

    Filing date: 2006-02-01

    IPC classification: G06F7/00

    CPC classification: G06K9/6282

    Abstract: Binning of predictor values used for generating a data mining model provides useful reduction in memory footprint and computation during the computationally dominant decision tree build phase, but reduces the information loss of the model and reduces the introduction of false information artifacts. A method of binning data in a database for data mining modeling in a database system, the data stored in a database table in the database system, the data mining modeling having selected at least one predictor and one target for the data, the data including a plurality of values of the predictor and a plurality of values of the target, the method comprises constructing a binary tree for the predictor that splits the values of the predictor into a plurality of portions, pruning the binary tree, and defining as bins of the predictor leaves of the tree that remain after pruning, each leaf of the tree representing a portion of the values of the predictor.

    Support vector machines in a relational database management system
    9.
    Patent application (legal status: in force)

    Publication No.: US20050050087A1

    Publication date: 2005-03-03

    Application No.: US10927024

    Filing date: 2004-08-27

    IPC classification: G06F17/00 G06F17/30

    Abstract: An implementation of SVM functionality integrated into a relational database management system (RDBMS) improves efficiency, time consumption, and data security, reduces the parameter tuning challenges presented to the inexperienced user, and reduces the computational costs of building SVM models. A database management system comprises data stored in the database management system and a processing unit comprising a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the database management system, and an apply unit operable to apply the support vector machine model using the data stored in the database management system. The database management system may be a relational database management system.

    Support vector machines processing system
    10.
    Patent application (legal status: in force)

    Publication No.: US20050049990A1

    Publication date: 2005-03-03

    Application No.: US10927111

    Filing date: 2004-08-27

    Abstract: An implementation of SVM functionality improves efficiency, time consumption, and data security, reduces the parameter tuning challenges presented to the inexperienced user, and reduces the computational costs of building SVM models. A system for support vector machine processing comprises data stored in the system, a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the system, based on a plurality of model-building parameters, a parameter estimation unit operable to estimate values for at least some of the model-building parameters, and an apply unit operable to apply the support vector machine model using the data stored in the system.