Consistent histogram maintenance using query feedback
    1.
    Granted patent
    Status: Lapsed

    Publication number: US07512574B2

    Publication date: 2009-03-31

    Application number: US11239044

    Filing date: 2005-09-30

    CPC classification: G06F17/30536

    Abstract: A novel method collects optimizer statistics for optimizing database queries by gathering feedback from the query execution engine about the observed cardinality of predicates, and by constructing and maintaining multidimensional histograms from that feedback. This exploits the correlation between data columns without an inefficient data scan. The maximum entropy principle is used to approximate the true data distribution by a histogram distribution that is as “simple” as possible while being consistent with the observed predicate cardinalities. The method adapts readily to changes in the underlying data, automatically detecting and efficiently eliminating inconsistent feedback information. The size of the histogram is controlled by retaining only the most “important” feedback.

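
As a rough illustration of the maximum entropy idea in the abstract above, the following Python sketch fits a one-dimensional histogram to observed predicate fractions by iterative proportional scaling. The function name `max_entropy_histogram`, the representation of feedback as (bucket set, observed fraction) pairs, and the fixed iteration count are assumptions made for this example and are not taken from the patent. Detection of inconsistent feedback and pruning of "unimportant" feedback are not shown.

```python
def max_entropy_histogram(n_buckets, feedback, iterations=200):
    """Fit bucket probabilities to feedback, starting from a uniform histogram.

    feedback: list of (set_of_bucket_indices, observed_fraction) pairs
    (hypothetical format). Assumes the feedback is mutually consistent.
    """
    # The uniform distribution is the maximum entropy starting point.
    p = [1.0 / n_buckets] * n_buckets
    for _ in range(iterations):
        for buckets, target in feedback:
            inside = sum(p[i] for i in buckets)
            outside = sum(p[i] for i in range(n_buckets) if i not in buckets)
            for i in range(n_buckets):
                if i in buckets:
                    # Rescale covered buckets so they sum to the observed fraction.
                    if inside > 0:
                        p[i] *= target / inside
                elif outside > 0:
                    # Rescale the remaining buckets so the total stays at 1.
                    p[i] *= (1.0 - target) / outside
    return p


if __name__ == "__main__":
    # One observation: buckets 0 and 1 together hold 80% of the rows.
    print(max_entropy_histogram(4, [({0, 1}, 0.8)]))
```

With a single observation, the mass inside and outside the observed range is spread evenly, which is the "simplest" distribution consistent with that feedback. Overlapping observations are reconciled by repeating the scaling passes.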

    Consistent and unbiased cardinality estimation for complex queries with conjuncts of predicates
    2.
    Granted patent
    Status: In force

    Publication number: US07512629B2

    Publication date: 2009-03-31

    Application number: US11457418

    Filing date: 2006-07-13

    IPC classification: G06F7/00

    Abstract: The present invention provides a method of selectivity estimation in which preprocessing steps improve the feasibility and efficiency of the estimation. The preprocessing steps are partitioning (to make iterative scaling estimation terminate in a reasonable time even for large sets of predicates), forced partitioning (to enable partitioning when there are no “natural” partitions, by finding the subsets of predicates whose use as partitions least impacts the overall solution), inconsistency resolution (to ensure that there is always a correct and feasible solution), and implied zero elimination (to ensure convergence of the iterative scaling computation under all circumstances). Together, these preprocessing steps make a maximum entropy method of selectivity estimation produce a correct cardinality model for any kind of query with conjuncts of predicates. In addition, the preprocessing steps can also be used in conjunction with prior art methods for building a cardinality model.

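
The partitioning step mentioned in the abstract above can be pictured with a small Python sketch. It groups predicates into independent partitions whenever no known statistic links them, so that iterative scaling could run on each group separately. The function name `partition_predicates` and the input format (predicates as integer indices, each known statistic as the set of predicates it covers) are assumptions for illustration only. Forced partitioning, inconsistency resolution, and implied zero elimination are not shown.

```python
def partition_predicates(n_predicates, known_stats):
    """Group predicate indices that are linked by at least one known statistic.

    known_stats: iterable of sets of predicate indices (hypothetical format),
    one set per conjunct whose selectivity is known.
    """
    parent = list(range(n_predicates))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for stat in known_stats:
        members = sorted(stat)
        for other in members[1:]:
            # Predicates that share a statistic must be solved together.
            parent[find(other)] = find(members[0])

    groups = {}
    for i in range(n_predicates):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    print(partition_predicates(5, [{0, 1}, {1, 2}, {3}]))
```

In this example, five predicates would need 2^5 = 32 atoms if solved jointly, while the three partitions need only 8 + 2 + 2 = 12, which is why partitioning helps the computation finish in reasonable time.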

    AUTOMATICALLY AND ADAPTIVELY DETERMINING EXECUTION PLANS FOR QUERIES WITH PARAMETER MARKERS
    3.
    Patent application
    Status: Lapsed

    Publication number: US20080222093A1

    Publication date: 2008-09-11

    Application number: US12125221

    Filing date: 2008-05-22

    IPC classification: G06F17/30

    CPC classification: G06F17/30469

    Abstract: A method and system for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated. A query workload and/or database statistics are dynamically updated. A new set of training points is collected off-line. Using the new set of training points, the first classifier is modified into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. A mapping of the selectivities into a plan determines the query execution plan. The determined query execution plan is included in an augmented set of training points, where the augmented set includes the initial set and the new set.

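
To make the flow in the abstract above concrete, here is a minimal Python sketch of a classifier that maps a vector of predicate selectivities to a plan and is later modified with new training points. The abstract does not name a classifier at this level, so a plain nearest-neighbor lookup is swapped in purely for illustration. The class name `PlanClassifier`, its methods, and the plan labels are all hypothetical.

```python
import math


class PlanClassifier:
    """Nearest-neighbor stand-in for the classifier described in the abstract."""

    def __init__(self, training_points):
        # Each training point is (selectivity_vector, plan_label).
        self.points = list(training_points)

    def retrain(self, new_points):
        # The second classifier uses the augmented set: initial plus new points.
        return PlanClassifier(self.points + list(new_points))

    def plan_for(self, selectivities):
        # Map the selectivities of the bound predicates to the closest known plan.
        nearest = min(self.points, key=lambda p: math.dist(p[0], selectivities))
        return nearest[1]


if __name__ == "__main__":
    first = PlanClassifier([((0.01, 0.01), "index_scan"), ((0.9, 0.9), "table_scan")])
    second = first.retrain([((0.5, 0.5), "hash_join")])
    print(first.plan_for((0.45, 0.5)), second.plan_for((0.45, 0.5)))
```

At runtime, the parameter markers are bound to actual values, the selectivities of the resulting predicates are estimated, and `plan_for` returns a plan without invoking the optimizer again.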

    Automatically and adaptively determining execution plans for queries with parameter markers
    6.
    Granted patent
    Status: Lapsed

    Publication number: US07958113B2

    Publication date: 2011-06-07

    Application number: US12125221

    Filing date: 2008-05-22

    IPC classification: G06F7/00 G06F17/30 G06F15/16

    CPC classification: G06F17/30469

    Abstract: A method and system for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated. A query workload and/or database statistics are dynamically updated. A new set of training points is collected off-line. Using the new set of training points, the first classifier is modified into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. A mapping of the selectivities into a plan determines the query execution plan. The determined query execution plan is included in an augmented set of training points, where the augmented set includes the initial set and the new set.


    AUTOMATICALLY AND ADAPTIVELY DETERMINING EXECUTION PLANS FOR QUERIES WITH PARAMETER MARKERS
    8.
    Patent application
    Status: Under examination (published)

    Publication number: US20080195577A1

    Publication date: 2008-08-14

    Application number: US11673091

    Filing date: 2007-02-09

    IPC classification: G06F17/30

    CPC classification: G06F16/24545

    Abstract: A method for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated using a set of random decision trees (RDTs). A query workload and/or database statistics are dynamically updated. A new set of training points collected off-line is used to modify the first classifier into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. The query execution plan is determined by identifying an optimal average of posterior probabilities obtained across a set of RDTs and mapping the selectivities to a plan. The determined query execution plan is included in an augmented set of training points that includes the initial set and the new set.

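
The use of random decision trees in the abstract above can be sketched as follows. Each tree splits on a randomly chosen selectivity dimension at a random threshold, each leaf stores plan counts, and the plan with the highest posterior probability averaged over all trees is chosen. This is one plausible reading of "optimal average of posterior probabilities", not the patent's procedure. The names `build_tree`, `build_forest`, `posterior`, and `choose_plan`, as well as the tree depth and tree count, are assumptions for this example.

```python
import random
from collections import Counter


def build_tree(points, depth, rng):
    """points: non-empty list of (selectivity_vector, plan_label) pairs."""
    labels = Counter(plan for _, plan in points)
    if depth == 0 or len(labels) <= 1:
        return {"leaf": labels}
    dim = rng.randrange(len(points[0][0]))  # random feature, chosen without looking at labels
    threshold = rng.random()                # selectivities lie in [0, 1]
    left = [p for p in points if p[0][dim] < threshold]
    right = [p for p in points if p[0][dim] >= threshold]
    if not left or not right:
        return {"leaf": labels}
    return {"dim": dim, "threshold": threshold,
            "left": build_tree(left, depth - 1, rng),
            "right": build_tree(right, depth - 1, rng)}


def build_forest(points, n_trees=10, depth=4, seed=0):
    rng = random.Random(seed)
    return [build_tree(points, depth, rng) for _ in range(n_trees)]


def posterior(tree, x):
    """Walk to a leaf and return its plan frequencies as probabilities."""
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["dim"]] < tree["threshold"] else tree["right"]
    total = sum(tree["leaf"].values())
    return {plan: count / total for plan, count in tree["leaf"].items()}


def choose_plan(trees, x):
    """Average the per-tree posteriors and return the most probable plan."""
    average = Counter()
    for tree in trees:
        for plan, prob in posterior(tree, x).items():
            average[plan] += prob / len(trees)
    return max(average, key=average.get)


if __name__ == "__main__":
    training = [((0.05, 0.5), "index_scan"), ((0.1, 0.9), "index_scan"),
                ((0.8, 0.2), "table_scan"), ((0.9, 0.7), "table_scan")]
    print(choose_plan(build_forest(training), (0.07, 0.6)))
```

Because the splits are random, individual trees are weak; the averaging across trees is what stabilizes the plan choice.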