JOIN-BASED CONTAINMENT FOR SET OPERATION-BASED SUBQUERY REMOVAL

    Publication No.: US20220309062A1

    Publication Date: 2022-09-29

    Application No.: US17213034

    Filing Date: 2021-03-25

    Abstract: Techniques are described herein for subquery removal given two set operation-based subqueries in a query, where one subquery contains the result of the other. The described optimization technique of subquery removal is enabled by join and set operation-based containment of the set operation-based subqueries where semantic equivalence can be established for a given pair of set operation-based subqueries when some table(s)—with associated join condition(s), correlation condition(s), and/or filter predicate(s)—in one subquery are not considered. Subquery removal reduces multiple access to the same table and multiple evaluations of the same join conditions required to evaluate the query. When a subquery is removed from a disjunction, this may lead to other optimizations such as subquery unnesting, e.g., when the original query configuration would not permit query unnesting and the rewritten query (with one or more removed subqueries) permits unnesting.
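
    Illustrative sketch (not the patented implementation): the short Python example below models each set operation-based subquery as a UNION of in-memory branches. Because the smaller subquery's branches are a subset of the larger one's, the disjunct that references the contained subquery can be dropped without changing the result; all table names and data are invented for the example.

```python
# Toy illustration of set operation-based containment; the "tables" are
# plain Python sets and all names are hypothetical.
t1 = {1, 2, 3}
t2 = {3, 4, 5}

def union_subquery(*tables):
    """Evaluate a UNION of single-column branches."""
    result = set()
    for t in tables:
        result |= t
    return result

big   = union_subquery(t1, t2)   # SELECT c FROM t1 UNION SELECT c FROM t2
small = union_subquery(t1)       # SELECT c FROM t1   (contained in `big`)

rows = [0, 1, 3, 4, 6]           # outer-query column values

# Original predicate: x IN big OR x IN small (two subquery evaluations).
original  = [x for x in rows if x in big or x in small]

# Because small is contained in big, the contained subquery can be removed.
rewritten = [x for x in rows if x in big]

assert original == rewritten     # the rewrite is semantics-preserving
```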

    Optimized execution of queries involving early terminable database operators

    Publication No.: US10891271B2

    Publication Date: 2021-01-12

    Application No.: US15989560

    Filing Date: 2018-05-25

    Abstract: According to embodiments, a multi-node database management system allows consumer processes (“consumers”) implementing a portion of a distributed data-combination operation to independently send a STOP notification to corresponding producer processes (“producers”). Upon a given consumer determining that it requires no further information from corresponding producers, the consumer sends a STOP notification to the producers. When a given consumer sends out a STOP notification, the producers drop any data destined for that consumer and also stop preparing data for and sending rows to it. Furthermore, once the producers receive STOP notifications from all of their corresponding consumers, the producers stop the current sub-plan execution immediately without requiring completion of the sub-plan. Thus, embodiments significantly improve query execution performance by avoiding scanning and distributing data that is not needed for execution of the distributed operation.
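
    A minimal single-process simulation of the STOP protocol described above, under the assumption of hash-distributed rows and consumers that each need only a few rows; the process model, the bookkeeping structures, and all names are hypothetical rather than taken from the patent.

```python
# Hypothetical single-process simulation of consumer-driven STOP notifications.
from collections import defaultdict

NUM_CONSUMERS = 2
ROWS_NEEDED = 3            # each consumer needs only 3 rows (early termination)

stopped = set()            # consumers that have sent STOP
received = defaultdict(list)

def distribute(row):
    """Producer-side distribution: hash-route a row to one consumer."""
    return hash(row) % NUM_CONSUMERS

def producer(rows):
    scanned = 0
    for row in rows:
        if len(stopped) == NUM_CONSUMERS:
            break                        # all consumers stopped: abandon sub-plan
        scanned += 1
        target = distribute(row)
        if target in stopped:
            continue                     # drop data destined for a stopped consumer
        received[target].append(row)     # "send" the row to the consumer
        if len(received[target]) >= ROWS_NEEDED:
            stopped.add(target)          # models the consumer sending STOP back
    return scanned

scanned = producer(range(1_000_000))
print(f"scanned only {scanned} rows instead of 1,000,000")
```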

    Enhancing Parallelism in Evaluation Ranking/Cumulative Window Functions
    Published patent application (in force)

    Publication No.: US20140214799A1

    Publication Date: 2014-07-31

    Application No.: US13754740

    Filing Date: 2013-01-30

    CPC classification number: G06F17/30445

    Abstract: According to one aspect of the invention, for a database statement that specifies evaluating ranking or cumulative window functions, an execution strategy based on an extended data distribution key may be used for the database statement. In the execution strategy, each sort operator of multiple parallel processing sort operators computes locally evaluated results of a ranking or cumulative window function based on a subset of rows in all rows used to evaluate the database statement, and sends the first and last rows' locally evaluated results to a query coordinator. The query coordinator consolidates the locally evaluated results received from the multiple parallel processing sort operators and sends consolidated results to the sort operators based on their respective demographics. Each sort operator completes full evaluation of the ranking or cumulative window functions based at least in part on one or more of the consolidated results provided by the query coordinator.
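
    As a rough illustration of the extended-distribution-key strategy for a cumulative window function (a running SUM ordered by a key), the sketch below range-partitions rows across three simulated sort operators; the data, slice boundaries, and variable names are assumptions made only for this example.

```python
# Illustrative parallel evaluation of SUM(v) OVER (ORDER BY k) using
# range-partitioned slices; names and data are hypothetical.
rows = [(k, k % 5 + 1) for k in range(12)]        # (order key, value)

# Range-distribute rows to 3 "sort operators" (already ordered by k here).
slices = [rows[0:4], rows[4:8], rows[8:12]]

# Phase 1: each operator computes locally evaluated running sums and sends
# its last row's local result to the query coordinator.
local_results, boundary_totals = [], []
for sl in slices:
    running, local = 0, []
    for k, v in sl:
        running += v
        local.append((k, running))
    local_results.append(local)
    boundary_totals.append(running)               # last row's local result

# Query coordinator: consolidate boundary values into per-operator offsets.
offsets, acc = [], 0
for total in boundary_totals:
    offsets.append(acc)
    acc += total

# Phase 2: each operator adds its offset to complete the full evaluation.
parallel = [(k, s + off)
            for local, off in zip(local_results, offsets)
            for k, s in local]

# Sanity check against a serial running sum.
serial, running = [], 0
for k, v in rows:
    running += v
    serial.append((k, running))
assert parallel == serial
```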

    Efficient execution of a sequence of SQL operations using runtime partition injection and iterative execution

    Publication No.: US11301468B2

    Publication Date: 2022-04-12

    Application No.: US16571006

    Filing Date: 2019-09-13

    Abstract: Execution plans generated for multiple analytic queries incorporate two new kinds of plan operators: a partition creator and a partition iterator. The partition creator and partition iterator operate as a pair. A partition creator operator creates partitions of rows and a partitioning descriptor describing the partitions created. A partition iterator iterates through the partitions based on the partitioning descriptor. For each partition, multiple analytic operators are executed serially, one after the other, on the same rows in the partition. According to an embodiment, partitioning is based on a common grouping or subgrouping of the multiple analytic functions or operators. Columns in the grouping or subgrouping may be ignored when executing each of the multiple analytic operators. Forming execution plans that include a partition creator and a partition iterator in this way is referred to herein as partitioning injection.
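
    A toy sketch of the partition creator / partition iterator pairing, assuming in-memory rows and two hypothetical analytic functions; it shows several analytic computations running serially over each partition produced by a single partitioning pass, not the actual plan operators.

```python
# Hypothetical partition creator / partition iterator pair over in-memory rows.
from collections import defaultdict

rows = [
    {"dept": "eng",   "name": "ann", "salary": 120},
    {"dept": "eng",   "name": "bob", "salary": 100},
    {"dept": "sales", "name": "cat", "salary": 90},
    {"dept": "sales", "name": "dan", "salary": 110},
]

def partition_creator(rows, key):
    """Create partitions and a descriptor listing them (the partition keys)."""
    parts = defaultdict(list)
    for r in rows:
        parts[r[key]].append(r)
    descriptor = sorted(parts)           # partitioning descriptor
    return parts, descriptor

def partition_iterator(parts, descriptor, analytic_ops):
    """Iterate partitions; run every analytic operator serially on each one."""
    for key in descriptor:
        partition = parts[key]
        for op in analytic_ops:          # serial execution over the same rows
            op(partition)
        yield key, partition

def rank_by_salary(partition):
    for i, r in enumerate(sorted(partition, key=lambda r: -r["salary"]), 1):
        r["rank"] = i

def share_of_partition(partition):
    total = sum(r["salary"] for r in partition)
    for r in partition:
        r["share"] = round(r["salary"] / total, 2)

parts, desc = partition_creator(rows, "dept")
for key, partition in partition_iterator(parts, desc,
                                         [rank_by_salary, share_of_partition]):
    print(key, partition)
```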

    Dynamic parallel aggregation with hybrid batch flushing
    Granted patent (in force)

    Publication No.: US09460154B2

    Publication Date: 2016-10-04

    Application No.: US13705004

    Filing Date: 2012-12-04

    CPC classification number: G06F17/30489

    Abstract: A method, apparatus, and system for dynamic parallel aggregation with hybrid batch flushing are provided. Record sources of an aggregation operator in a query execution plan may dynamically aggregate using the same aggregation operator. The dynamic aggregation creates a batch of aggregation records from an input source, which are then used to aggregate further records from the input source. If a record from the input source is not matched to an aggregation record in the batch, then the record is passed to the next operator. In this manner, records are aggregated ahead of time at a record source to reduce the number of records passed between operators, reducing the impact of network I/O between nodes of a parallel processing system. By adjusting the contents of the batch according to aggregation performance monitored during run-time, hybrid batch flushing can be implemented to adapt to changing data patterns and skewed values.
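
    A simplified pre-aggregation sketch in the spirit of the abstract, with an invented batch size, hit-rate threshold, and window length: rows matching an aggregation record in the batch are folded in locally, unmatched rows pass through to the next operator, and a poor hit rate flushes the batch so it can adapt to a changed data pattern.

```python
# Hypothetical pre-aggregation with a small batch of aggregation records.
def pre_aggregate(rows, batch_size=4, min_hit_rate=0.3, window=100):
    batch = {}                       # group key -> running SUM
    hits = seen = 0
    for key, value in rows:
        seen += 1
        if key in batch:
            batch[key] += value      # aggregate ahead of time at the record source
            hits += 1
        elif len(batch) < batch_size:
            batch[key] = value       # adopt the key into the batch
        else:
            yield key, value         # no match: pass the row to the next operator
        # Hybrid flushing: if the recent hit rate is poor, flush and rebuild.
        if seen >= window:
            if hits / seen < min_hit_rate:
                for item in batch.items():
                    yield item       # emit partially aggregated records downstream
                batch.clear()        # adapt to changing data patterns / skew
            hits = seen = 0
    yield from batch.items()         # flush remaining aggregation records

rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("a", 5), ("d", 6), ("e", 7)]
print(list(pre_aggregate(rows, batch_size=2)))
# [('c', 4), ('d', 6), ('e', 7), ('a', 9), ('b', 2)] -- partial sums, combined
# again by the downstream aggregation operator.
```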

    Data-aware scalable parallel execution of rollup operations
    Granted patent (in force)

    Publication No.: US09235621B2

    Publication Date: 2016-01-12

    Application No.: US13754770

    Filing Date: 2013-01-30

    CPC classification number: G06F17/30483 G06F17/30445

    Abstract: According to one aspect of the invention, for a database statement that specifies rollup operations, a data distribution key may be selected among a plurality of candidate keys. Numbers of distinct values of the candidate keys may be monitored with respect to a particular set of rows. Hash values may also be generated by column values in the candidate keys. The data distribution key may be determined based on results of monitoring the numbers of distinct values of the candidate keys as well as the frequencies of hash values computed based on column values of the candidate keys. Rollup operations may be shared between different stages of parallel executing processes and data may be distributed between the different stages of parallel executing processes based on the selected data distribution key.
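
    A small sketch of the key-selection idea under assumed scoring rules (the optimizer's actual statistics and thresholds are not given in the abstract): sample rows, estimate distinct values per candidate key, measure skew in the hash-value frequencies, and prefer a high-NDV, low-skew key.

```python
# Hypothetical selection of a data distribution key among candidate columns.
from collections import Counter

sample = [
    {"region": "us", "store": 1, "day": "mon"},
    {"region": "us", "store": 2, "day": "tue"},
    {"region": "us", "store": 3, "day": "mon"},
    {"region": "eu", "store": 4, "day": "wed"},
    {"region": "us", "store": 5, "day": "thu"},
    {"region": "eu", "store": 6, "day": "fri"},
]
candidates = ["region", "store", "day"]

def choose_distribution_key(sample, candidates, buckets=4):
    best_key, best_score = None, -1.0
    for key in candidates:
        values = [row[key] for row in sample]
        ndv = len(set(values))                          # distinct values observed
        freq = Counter(hash(v) % buckets for v in values)
        skew = max(freq.values()) / len(values)         # heaviest hash bucket
        score = ndv * (1.0 - skew)                      # favor high NDV, low skew
        if score > best_score:
            best_key, best_score = key, score
    return best_key

print(choose_distribution_key(sample, candidates))      # "store" on this sample
```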

    Scalable and adaptive evaluation of reporting window functions

    Publication No.: US09183252B2

    Publication Date: 2015-11-10

    Application No.: US13754687

    Filing Date: 2013-01-30

    Abstract: According to one aspect of the invention, for a database statement that specifies evaluating reporting window functions, a computation-pushdown execution strategy may be used for the database statement. The computation-pushdown execution plan includes producer operators and consolidation operators. Each producer operator computes a respective partial aggregation for each reporting window function based on a subset of rows, and broadcasts the respective partial aggregation. Each consolidation operator fully aggregates all partial aggregations broadcasted from the producer operators. Alternatively, an extended-data-distribution-key execution plan may be used. Each producer operator sends rows based on hash keys to sort operators for computing partial aggregations for at least one reporting window function based on a subset of rows. Each consolidation operator receives and fully aggregates all partial aggregations broadcasted from the sort operators.
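
    A toy version of the computation-pushdown strategy for a reporting aggregate such as SUM(amount) OVER (PARTITION BY grp); the producer count, rows, and names are assumptions for illustration only.

```python
# Hypothetical computation pushdown for SUM(amount) OVER (PARTITION BY grp).
from collections import defaultdict

# Rows as seen by two producer operators (each holds a subset of all rows).
producer_rows = [
    [("a", 10), ("b", 5), ("a", 7)],      # producer 0
    [("a", 3), ("b", 8), ("b", 2)],       # producer 1
]

# Each producer computes a partial aggregation and broadcasts it.
broadcast = []
for rows in producer_rows:
    partial = defaultdict(int)
    for grp, amount in rows:
        partial[grp] += amount
    broadcast.append(dict(partial))       # broadcast to all consolidation operators

# Every consolidation operator fully aggregates all broadcast partials.
full = defaultdict(int)
for partial in broadcast:
    for grp, s in partial.items():
        full[grp] += s

# The reporting aggregate is then attached to every row a consolidator holds.
for rows in producer_rows:
    for grp, amount in rows:
        print(grp, amount, full[grp])     # e.g. ('a', 10, 20)
```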

    Data-Aware Scalable Parallel Execution of Rollup Operations
    Published patent application (in force)

    Publication No.: US20140214800A1

    Publication Date: 2014-07-31

    Application No.: US13754770

    Filing Date: 2013-01-30

    CPC classification number: G06F17/30483 G06F17/30445

    Abstract: According to one aspect of the invention, for a database statement that specifies rollup operations, a data distribution key may be selected among a plurality of candidate keys. Numbers of distinct values of the candidate keys may be monitored with respect to a particular set of rows. Hash values may also be generated by column values in the candidate keys. The data distribution key may be determined based on results of monitoring the numbers of distinct values of the candidate keys as well as the frequencies of hash values computed based on column values of the candidate keys. Rollup operations may be shared between different stages of parallel executing processes and data may be distributed between the different stages of parallel executing processes based on the selected data distribution key.

    Fusing global reporting aggregate computation with the underlying operation in the query tree for efficient evaluation

    Publication No.: US11036734B2

    Publication Date: 2021-06-15

    Application No.: US15063828

    Filing Date: 2016-03-08

    Abstract: Techniques herein generate a query plan that combines a global reporting aggregate calculation and an organizing operation. A method detects an organizing operation, a group aggregate function, and a global aggregate function within a database statement. The organizing operation specifies organizational activities such as grouping, joining, or sorting rows. The method generates an execution plan that specifies calculating all values in a single pass. For each row, the single pass applies the organizing operation and updates an access structure. The pass updates one of multiple cumulative group calculations based on the group aggregate function and updates a cumulative global calculation based on the global aggregate function. Each cumulative group calculation is associated with some of the access structure. Based on the access structure, result rows that satisfy the database statement are generated. Result rows contain a final result of each group calculation and a final result of the global calculation.
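
    A compact single-pass sketch of the fusion idea, using a plain Python dict as the access structure and SUM as both the group and the global aggregate; names and data are illustrative.

```python
# Hypothetical single-pass fusion of GROUP BY SUM with a global reporting SUM.
rows = [("east", 10), ("west", 4), ("east", 6), ("north", 5)]

group_sums = {}          # access structure: one cumulative calculation per group
global_sum = 0           # cumulative global calculation

for grp, amount in rows:                       # single pass over the input
    group_sums[grp] = group_sums.get(grp, 0) + amount
    global_sum += amount

# Result rows contain the final group result and the final global result,
# e.g. SELECT grp, SUM(amount), SUM(SUM(amount)) OVER () ... GROUP BY grp.
result = [(grp, s, global_sum) for grp, s in group_sums.items()]
print(result)            # [('east', 16, 25), ('west', 4, 25), ('north', 5, 25)]
```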

    ADAPTIVE GRANULE GENERATION FOR PARALLEL QUERIES WITH RUN-TIME DATA PRUNING

    Publication No.: US20200026788A1

    Publication Date: 2020-01-23

    Application No.: US16039238

    Filing Date: 2018-07-18

    Abstract: Techniques herein improve computational efficiency for parallel queries with run-time data pruning by using adaptive granule generation. In an embodiment, an execution plan is generated for a query to be executed by a plurality of slave processes, the execution plan comprising a plurality of plan operators. For a first plan operator of the plurality of plan operators, a first set of work granules is generated, and for a second plan operator of the plurality of plan operators, a second set of work granules is generated. A first subset of slave processes of the plurality of slave processes is assigned the first set of work granules. Based on the execution of the first set of work granules by the first subset of slave processes, a bloom filter is generated that specifies for which of said first set of work granules no output rows were generated. Based on the bloom filter, the second set of work granules is modified and the modified second set of work granules is assigned to a second subset of slave processes and executed.
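
    A stripped-down sketch of run-time granule pruning with a small Bloom filter; the filter parameters, the granule model, and every helper name are assumptions, not details from the patent.

```python
# Hypothetical run-time pruning of work granules with a tiny Bloom filter.
import hashlib

class BloomFilter:
    def __init__(self, bits=256, hashes=3):
        self.bits, self.hashes, self.field = bits, hashes, 0

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.bits

    def add(self, item):
        for pos in self._positions(item):
            self.field |= 1 << pos

    def might_contain(self, item):
        return all(self.field >> pos & 1 for pos in self._positions(item))

# First plan operator: one granule per partition; some produce no output rows.
first_granules = {"p1": [("k1", 1)], "p2": [], "p3": [("k3", 7)], "p4": []}

produced_output = BloomFilter()
for partition, output_rows in first_granules.items():   # "execute" the first set
    if output_rows:
        produced_output.add(partition)

# Second plan operator: drop granules whose partition certainly produced no rows.
second_granules = ["p1", "p2", "p3", "p4"]
pruned = [g for g in second_granules if produced_output.might_contain(g)]
print(pruned)   # ['p1', 'p3'] -- only these are assigned to slave processes
```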
