Consistent histogram maintenance using query feedback
    1.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07512574B2

    Publication date: 2009-03-31

    Application No.: US11239044

    Filing date: 2005-09-30

    CPC classification: G06F17/30536

    Abstract: A novel method collects optimizer statistics for database query optimization by gathering feedback from the query execution engine about the observed cardinality of predicates, and uses that feedback to construct and maintain multidimensional histograms. This approach exploits the correlation between data columns without an inefficient data scan. The maximum entropy principle is used to approximate the true data distribution by a histogram distribution that is as “simple” as possible while being consistent with the observed predicate cardinalities. The method readily adapts to changes in the underlying data, automatically detecting and eliminating inconsistent feedback information in an efficient manner. The size of the histogram is controlled by retaining only the most “important” feedback.
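As a rough illustration of the maximum entropy idea in this abstract, the sketch below fits bucket probabilities to observed predicate selectivities using iterative scaling. It is a minimal sketch and not the patented procedure: the function name `fit_max_entropy`, the representation of feedback as (bucket set, observed fraction) pairs, and the fixed iteration count are all assumptions made for the example, and the feedback is assumed to be mutually consistent.

```python
def fit_max_entropy(n_buckets, constraints, iterations=200):
    """Fit a histogram over n_buckets that matches the observed feedback.

    constraints is a list of (bucket_index_set, observed_fraction) pairs
    (a hypothetical encoding of query feedback). Starting from the uniform
    distribution and rescaling to satisfy one constraint at a time
    (iterative scaling) converges to the maximum entropy distribution
    when the constraints are consistent.
    """
    p = [1.0 / n_buckets] * n_buckets
    for _ in range(iterations):
        for buckets, target in constraints:
            inside = sum(p[i] for i in buckets)
            outside = 1.0 - inside
            for i in range(n_buckets):
                if i in buckets:
                    if inside > 0:
                        p[i] *= target / inside
                elif outside > 0:
                    p[i] *= (1.0 - target) / outside
    return p


# Four buckets laid out as a 2x2 grid over columns a and b:
# bucket 0 = (a low, b low), 1 = (a low, b high),
# bucket 2 = (a high, b low), 3 = (a high, b high).
feedback = [
    ({0, 1}, 0.3),  # observed: 30% of rows satisfy "a low"
    ({0, 2}, 0.4),  # observed: 40% of rows satisfy "b low"
]
histogram = fit_max_entropy(4, feedback)
```

With only these two single-column observations, the fitted bucket fractions come out close to 0.12, 0.18, 0.28 and 0.42, which is the independence solution: absent any feedback about the correlation between `a` and `b`, the simplest consistent histogram assumes none.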


    Consistent and unbiased cardinality estimation for complex queries with conjuncts of predicates
    2.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07512629B2

    Publication date: 2009-03-31

    Application No.: US11457418

    Filing date: 2006-07-13

    IPC classification: G06F7/00

    Abstract: The present invention provides a method of selectivity estimation in which preprocessing steps improve the feasibility and efficiency of the estimation. The preprocessing steps are: partitioning (to make iterative scaling estimation terminate in a reasonable time even for large sets of predicates); forced partitioning (to enable partitioning when there are no “natural” partitions, by finding the subsets of predicates that create partitions with the least impact on the overall solution); inconsistency resolution (to ensure that there is always a correct and feasible solution); and implied zero elimination (to ensure convergence of the iterative scaling computation under all circumstances). Together, these preprocessing steps allow a maximum entropy method of selectivity estimation to produce a correct cardinality model for any kind of query with conjuncts of predicates. In addition, the preprocessing steps can be used in conjunction with prior art methods for building a cardinality model.
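The partitioning step in this abstract can be pictured with a small sketch. It assumes, as an interpretation not stated in the abstract, that "natural" partitions are groups of predicates linked by shared statistics: two predicates fall in the same partition if some known joint selectivity mentions both. The function name `partition_predicates` and the input encoding are hypothetical.

```python
def partition_predicates(predicates, known_stats):
    """Group predicates into independent partitions.

    known_stats is an iterable of predicate groups for which a joint
    selectivity is known (assumed encoding). Predicates that never
    co-occur, directly or transitively, end up in separate partitions,
    so each partition could be estimated on its own.
    """
    parent = {p: p for p in predicates}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for group in known_stats:
        members = list(group)
        for other in members[1:]:
            parent[find(other)] = find(members[0])

    partitions = {}
    for p in predicates:
        partitions.setdefault(find(p), []).append(p)
    return sorted(partitions.values())


parts = partition_predicates(
    ["p1", "p2", "p3", "p4", "p5"],
    [("p1", "p2"), ("p2", "p3"), ("p4",)],
)
# parts == [["p1", "p2", "p3"], ["p4"], ["p5"]]
```

Each partition is smaller than the full predicate set, which is what keeps an iterative scaling computation over it tractable. Forced partitioning, inconsistency resolution and implied zero elimination are not covered by this sketch.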


    MANAGING UNCERTAIN DATA USING MONTE CARLO TECHNIQUES
    7.
    Type: Invention patent application
    Status: In force

    Publication No.: US20100312775A1

    Publication date: 2010-12-09

    Application No.: US12477856

    Filing date: 2009-06-03

    IPC classification: G06F17/30

    CPC classification: G06F17/30536

    Abstract: According to one embodiment of the present invention, a method for managing uncertain data is provided. The method includes specifying data uncertainty using at least one variable generation (VG) function, wherein the VG function generates pseudorandom samples of uncertain data values. A random database based on the VG function is specified, and multiple Monte Carlo instantiations of the random database are generated. Using a Monte Carlo method, a query is repeatedly executed over the multiple Monte Carlo instantiations to output a Monte Carlo method result and associated query-results. The Monte Carlo method result may then be used to estimate statistical properties of a probability distribution of the query-result.
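The workflow in this abstract (VG functions, repeated instantiation, repeated query execution, summary statistics) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the patented system: the table layout, the normal-distribution VG function, and the names `normal_vg`, `instantiate` and `monte_carlo_query` are all invented for the example.

```python
import random
import statistics


def normal_vg(mean, stddev):
    """Return a VG function that draws pseudorandom normal samples."""
    def vg(rng):
        return rng.gauss(mean, stddev)
    return vg


def instantiate(uncertain_table, rng):
    """Build one concrete instance of the random database."""
    return [
        {"item": row["item"], "price": row["price_vg"](rng)}
        for row in uncertain_table
    ]


def total_price(table):
    """Example query: sum of the price column."""
    return sum(row["price"] for row in table)


def monte_carlo_query(uncertain_table, query, n_instances, seed=0):
    """Run the query over many instantiations and summarize the results."""
    rng = random.Random(seed)
    results = [
        query(instantiate(uncertain_table, rng)) for _ in range(n_instances)
    ]
    return statistics.mean(results), statistics.stdev(results)


uncertain_table = [
    {"item": "widget", "price_vg": normal_vg(10.0, 1.0)},
    {"item": "gadget", "price_vg": normal_vg(20.0, 1.0)},
]
mean, spread = monte_carlo_query(uncertain_table, total_price, 2000)
```

The returned mean and standard deviation estimate properties of the query-result distribution. Here the total price has a true mean of 30 and a true standard deviation of about 1.41, so the estimates should land near those values.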


    Systems and methods for large-scale randomized optimization for problems with decomposable loss functions
    9.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08983879B2

    Publication date: 2015-03-17

    Application No.: US13595618

    Filing date: 2012-08-27

    Abstract: Systems and methods are directed toward processing optimization problems using loss functions, wherein a loss function is decomposed into at least one stratum loss function, the loss of each stratum loss function is individually decreased to a predefined stratum loss threshold using gradient descent, and the overall loss is decreased to a predefined threshold for the loss function by appropriately ordering the processing of the strata and spending appropriate processing time in each stratum. Other embodiments and aspects are also described herein.
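To make the stratum idea concrete, the sketch below minimizes a squared-error loss that is split into per-stratum losses, running stochastic gradient descent inside one stratum at a time. It is a simplified illustration, not the claimed method: the one-parameter model y = w * x, the random visiting order, and the fixed number of steps per stratum (instead of a per-stratum loss threshold) are assumptions, as is the name `stratified_sgd`.

```python
import random


def stratified_sgd(strata, steps_per_stratum=20, epochs=50, lr=0.01, seed=0):
    """Fit y = w * x by gradient steps taken one stratum at a time.

    strata is a list of lists of (x, y) pairs. The overall loss is the
    sum of the per-stratum squared errors.
    """
    rng = random.Random(seed)
    w = 0.0
    for _epoch in range(epochs):
        # Choose an order in which to process the strata this epoch.
        order = list(range(len(strata)))
        rng.shuffle(order)
        for s in order:
            # Spend a fixed budget of gradient steps inside this stratum.
            for _step in range(steps_per_stratum):
                x, y = rng.choice(strata[s])
                grad = 2.0 * (w * x - y) * x  # d/dw of (w*x - y)^2
                w -= lr * grad
    return w


strata = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]
w = stratified_sgd(strata)
```

Because every data point here lies exactly on y = 3x, the fitted `w` approaches 3. In a distributed setting the appeal of such a decomposition is that work within a stratum can be organized independently of the other strata, although this sketch runs everything sequentially.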


    SYSTEMS AND METHODS FOR LARGE-SCALE RANDOMIZED OPTIMIZATION FOR PROBLEMS WITH DECOMPOSABLE LOSS FUNCTIONS
    10.
    Type: Invention patent application
    Status: Under examination (published)

    Publication No.: US20120331025A1

    Publication date: 2012-12-27

    Application No.: US13595618

    Filing date: 2012-08-27

    IPC classification: G06F7/38

    Abstract: Systems and methods are directed toward processing optimization problems using loss functions, wherein a loss function is decomposed into at least one stratum loss function, the loss of each stratum loss function is individually decreased to a predefined stratum loss threshold using gradient descent, and the overall loss is decreased to a predefined threshold for the loss function by appropriately ordering the processing of the strata and spending appropriate processing time in each stratum. Other embodiments and aspects are also described herein.
