Dynamic parallel aggregation with hybrid batch flushing
    Type: Granted invention patent
    Status: In force

    Publication No.: US09460154B2

    Publication date: 2016-10-04

    Application No.: US13705004

    Filing date: 2012-12-04

    CPC classification number: G06F17/30489

    Abstract: A method, apparatus, and system for dynamic parallel aggregation with hybrid batch flushing are provided. Record sources of an aggregation operator in a query execution plan may dynamically aggregate using the same aggregation operator. The dynamic aggregation creates a batch of aggregation records from an input source, which are then used to aggregate further records from the input source. If a record from the input source is not matched to an aggregation record in the batch, then the record is passed to the next operator. In this manner, records are aggregated ahead of time at a record source to reduce the number of records passed between operators, reducing the impact of network I/O between nodes of a parallel processing system. By adjusting the contents of the batch according to aggregation performance monitored during run-time, hybrid batch flushing can be implemented to adapt to changing data patterns and skewed values.
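
The pre-aggregation idea in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the SUM aggregate, the fixed-size batch, and the flush policy (emit the whole batch when the hit ratio over a window of records drops below a threshold) are all assumptions made for the example.

```python
from collections import Counter


def pre_aggregate(records, batch_size=2, window=4, min_hit_ratio=0.5):
    """Record-source side: yield (key, partial_sum) pairs for the next operator.

    All parameters are hypothetical tuning knobs chosen for illustration.
    """
    batch = {}          # key -> running partial sum (the "aggregation records")
    hits = seen = 0     # run-time statistics for the current window
    for key, value in records:
        seen += 1
        if key in batch:
            batch[key] += value        # matched: aggregate ahead of time
            hits += 1
        elif len(batch) < batch_size:
            batch[key] = value         # room left: start a new aggregation record
        else:
            yield key, value           # unmatched: pass the record through as-is
        if seen == window:
            # Assumed flush policy: if the batch is matching poorly, emit it
            # so that keys from the current data pattern can enter.
            if hits / seen < min_hit_ratio:
                yield from batch.items()
                batch = {}
            hits = seen = 0
    yield from batch.items()           # final flush at end of input


def final_aggregate(partials):
    """Downstream aggregation operator: merge partial sums and raw records."""
    totals = Counter()
    for key, value in partials:
        totals[key] += value
    return dict(totals)
```

For example, `final_aggregate(pre_aggregate([("a", 1), ("b", 2), ("a", 3), ("c", 4), ("a", 5), ("b", 6)]))` gives the same totals as aggregating the six raw records, while the record source forwards only five items.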

    Scalable and adaptive evaluation of reporting window functions

    Publication No.: US09183252B2

    Publication date: 2015-11-10

    Application No.: US13754687

    Filing date: 2013-01-30

    Abstract: According to one aspect of the invention, for a database statement that specifies evaluating reporting window functions, a computation-pushdown execution strategy may be used for the database statement. The computation-pushdown execution plan includes producer operators and consolidation operators. Each producer operator computes a respective partial aggregation for each reporting window function based on a subset of rows, and broadcasts the respective partial aggregation. Each consolidation operator fully aggregates all partial aggregations broadcasted from the producer operators. Alternatively, an extended-data-distribution-key execution plan may be used. Each producer operator sends rows based on hash keys to sort operators for computing partial aggregations for at least one reporting window function based on a subset of rows. Each consolidation operator receives and fully aggregates all partial aggregations broadcasted from the sort operators.
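
A rough sketch of the computation-pushdown strategy, assuming a single reporting window function of the form SUM(value) OVER (PARTITION BY key). The function names and the in-memory "broadcast" (a plain list of partial results) are assumptions for illustration only.

```python
from collections import Counter


def partial_aggregate(rows):
    """Producer operator: partial SUM per partition key over a local row subset."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial


def consolidate(local_rows, broadcast_partials):
    """Consolidation operator: merge every broadcast partial aggregation,
    then attach the full partition total to each locally held row."""
    totals = Counter()
    for partial in broadcast_partials:
        totals.update(partial)         # Counter.update adds values per key
    return [(key, value, totals[key]) for key, value in local_rows]
```

Only the small per-key partial results are exchanged between operators; the rows themselves stay where they are.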

    BITMAP-BASED COUNT DISTINCT QUERY REWRITE IN A RELATIONAL SQL ALGEBRA

    Publication No.: US20210109930A1

    Publication date: 2021-04-15

    Application No.: US16653639

    Filing date: 2019-10-15

    Abstract: Techniques are described for storing and maintaining, in a materialized view, bitmap data that represents a bitmap of each possible distinct value of an expression and rewriting a query for a count of distinct values of the expression using the materialized view. The materialized view contains bitmap data that represents a bitmap of each possible distinct value of a first expression, and aggregate values of additional expressions, and is stored in memory or on disk by a database system. The database system receives a query that requests a number of distinct values, of the first expression, and an aggregate value for an additional expression. In response, the database system, rewrites the query to: compute the number of distinct values by counting the bits in the bitmap data of the materialized view that are set to the first value, and obtains the aggregate value for the additional expression in the materialized view.
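
The bitmap rewrite can be illustrated with a small sketch. It assumes the expression values are small non-negative integers so that each value maps directly to one bit of a Python integer, and it uses SUM as the additional aggregate. `build_view` and `count_distinct_and_sum` are hypothetical names, not part of any database API.

```python
def build_view(rows):
    """Build a toy "materialized view": group -> [bitmap, sum_of_amount].

    rows: iterable of (group, expr_value, amount) tuples.
    """
    view = {}
    for group, expr_value, amount in rows:
        entry = view.setdefault(group, [0, 0])
        entry[0] |= 1 << expr_value    # set the bit for this distinct value
        entry[1] += amount             # maintain the additional aggregate
    return view


def count_distinct_and_sum(view, groups):
    """Answer COUNT(DISTINCT expr) and SUM(amount) from the view alone."""
    bitmap = 0
    total = 0
    for group in groups:
        bits, amount = view[group]
        bitmap |= bits                 # OR bitmaps to roll groups up
        total += amount
    return bin(bitmap).count("1"), total   # count the set bits
```

Because bitmaps combine with a bitwise OR, the same view can answer the distinct count for one group or for any union of groups without rescanning the base rows.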

    Scalable multistage processing of queries with percentile functions

    Publication No.: US10719516B2

    Publication date: 2020-07-21

    Application No.: US16113633

    Filing date: 2018-08-27

    Abstract: A method and system for processing database queries containing aggregate functions. The query may specify fewer groups than there are processes available to process the queries. Further, the queries may target a set of rows and specify a sort-by key and a group-by key. The method and system further includes determining that the queries specify application of the aggregate function to each of a plurality of groups that may correspond to a plurality of distinct values of the group-by key and determining that plurality of processes are available to process the queries. The method and system also includes determining the plurality of ranges of a composite key that may be formed by combining the group-by key and the sort-by key and assigning each range of the plurality ranges to a corresponding process to calculate the aggregate function.
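
The range-assignment step can be sketched as below, assuming each row is a (group-by key, sort-by key) tuple so that Python's tuple ordering serves as the composite key. Picking boundaries from a full sort is a simplification chosen for brevity (a real system would more likely sample), and both function names are hypothetical.

```python
from bisect import bisect_right


def choose_boundaries(rows, n_procs):
    """Pick n_procs - 1 split points on the composite (group, sort) key."""
    keys = sorted(rows)
    return [keys[len(keys) * i // n_procs] for i in range(1, n_procs)]


def assign(rows, boundaries):
    """Route each row to the process that owns its composite-key range."""
    ranges = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        ranges[bisect_right(boundaries, row)].append(row)
    return ranges
```

With two groups and four processes, partitioning on the group-by key alone would leave two processes idle; splitting on the composite key lets a large group span several processes while each range remains contiguous in sort order.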

    DYNAMIC PARALLEL AGGREGATION WITH HYBRID BATCH FLUSHING
    Type: Invention patent application
    Status: In force

    Publication No.: US20140156636A1

    Publication date: 2014-06-05

    Application No.: US13705004

    Filing date: 2012-12-04

    CPC classification number: G06F17/30489

    Abstract: A method, apparatus, and system for dynamic parallel aggregation with hybrid batch flushing are provided. Record sources of an aggregation operator in a query execution plan may dynamically aggregate using the same aggregation operator. The dynamic aggregation creates a batch of aggregation records from an input source, which are then used to aggregate further records from the input source. If a record from the input source is not matched to an aggregation record in the batch, then the record is passed to the next operator. In this manner, records are aggregated ahead of time at a record source to reduce the number of records passed between operators, reducing the impact of network I/O between nodes of a parallel processing system. By adjusting the contents of the batch according to aggregation performance monitored during run-time, hybrid batch flushing can be implemented to adapt to changing data patterns and skewed values.

    Parallel processing of queries with inverse distribution function

    Publication No.: US11176131B2

    Publication date: 2021-11-16

    Application No.: US16449382

    Filing date: 2019-06-22

    Abstract: Techniques are described for parallel processing of database queries with an inverse distribution function by a database management system (DBMS). To improve the execution time of a query with an inverse distribution function, the data set referenced in the inverse distribution function is range distributed among parallel processes that are spawned and managed by a query execution coordinator process (QC), in an embodiment. The parallel executing processes sort each range of the data set in parallel, while the QC determines the location(s) of inverse distribution function values based on the count of values in each range of the data set. The QC requests the parallel processes to produce to the next stage of parallel processes the values at the location(s) in the sorted ranges. The next stage of parallel processes computes the inverse distribution function based on the produced values. Techniques are also described for parallel executing of queries that may additionally include another inverse distribution function, one or more non-distinct aggregate functions and one or more distinct aggregate functions.
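
As a hedged illustration of the coordinator step, the sketch below evaluates PERCENTILE_DISC over data that is already range distributed (every value in range i is no greater than any value in range i + 1). The coordinator sees only the per-range row counts, and only the single located value is fetched. `locate` and `percentile_disc` are hypothetical names, and the per-range sorts run sequentially here rather than in parallel.

```python
import math


def locate(counts, fraction):
    """Coordinator: map PERCENTILE_DISC(fraction) to (range index, local offset)
    using only the row count of each range."""
    total = sum(counts)
    # 0-based global position of the first row whose cumulative
    # distribution is at least `fraction`.
    position = max(math.ceil(fraction * total), 1) - 1
    for index, count in enumerate(counts):
        if position < count:
            return index, position
        position -= count
    raise ValueError("no rows to evaluate")


def percentile_disc(ranges, fraction):
    """Sort each range independently, then fetch only the located value."""
    sorted_ranges = [sorted(r) for r in ranges]   # parallel in the described scheme
    index, offset = locate([len(r) for r in sorted_ranges], fraction)
    return sorted_ranges[index][offset]
```

The design point this shows is that no global sort or merge is needed: the counts alone tell the coordinator which process holds the answer.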

    Parallel processing of queries with inverse distribution function

    Publication No.: US10366082B2

    Publication date: 2019-07-30

    Application No.: US15375023

    Filing date: 2016-12-09

    Abstract: Techniques are described for parallel processing of database queries with an inverse distribution function by a database management system (DBMS). To improve the execution time of a query with an inverse distribution function, the data set referenced in the inverse distribution function is range distributed among parallel processes that are spawned and managed by a query execution coordinator process (QC), in an embodiment. The parallel executing processes sort each range of the data set in parallel, while the QC determines the location(s) of inverse distribution function values based on the count of values in each range of the data set. The QC requests the parallel processes to produce to the next stage of parallel processes the values at the location(s) in the sorted ranges. The next stage of parallel processes computes the inverse distribution function based on the produced values. Techniques are also described for parallel executing of queries that may additionally include another inverse distribution function, one or more non-distinct aggregate functions and one or more distinct aggregate functions.

    Enhancing Parallelism in Evaluation Ranking/Cumulative Window Functions
    Type: Invention patent application
    Status: In force

    Publication No.: US20140214799A1

    Publication date: 2014-07-31

    Application No.: US13754740

    Filing date: 2013-01-30

    CPC classification number: G06F17/30445

    Abstract: According to one aspect of the invention, for a database statement that specifies evaluating ranking or cumulative window functions, an execution strategy based on an extended data distribution key may be used for the database statement. In the execution strategy, each sort operator of multiple parallel processing sort operators computes locally evaluated results of a ranking or cumulative window function based on a subset of rows in all rows used to evaluate the database statement, and sends the first and last rows' locally evaluated results to a query coordinator. The query coordinator consolidates the locally evaluated results received from the multiple parallel processing sort operators and sends consolidated results to the sort operators based on their respective demographics. Each sort operator completes full evaluation of the ranking or cumulative window functions based at least in part on one or more of the consolidated results provided by the query coordinator.
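
A simplified sketch of the local-evaluate, consolidate, complete pattern for a cumulative SUM over range-distributed data. It is an assumption-laden reduction of the abstract: only each operator's last-row result is sent to the coordinator (the first-row result, which matters for ties in ranking functions, is left out), and all function names are hypothetical.

```python
from itertools import accumulate


def local_running_sums(sorted_values):
    """Sort operator: cumulative SUM evaluated over its own rows only."""
    return list(accumulate(sorted_values))


def coordinator_offsets(last_local_results):
    """Query coordinator: the offset for operator i is the combined total
    of all operators that hold earlier ranges."""
    offsets, running = [], 0
    for last in last_local_results:
        offsets.append(running)
        running += last
    return offsets


def cumulative_sum(ranges):
    """End-to-end sketch: local evaluation, consolidation, then completion."""
    local = [local_running_sums(sorted(r)) for r in ranges]
    offsets = coordinator_offsets([sums[-1] if sums else 0 for sums in local])
    return [[s + offset for s in sums] for sums, offset in zip(local, offsets)]
```

Each operator exchanges a single number with the coordinator, so the sort and the bulk of the window evaluation remain fully parallel.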

    SCALABLE MULTISTAGE PROCESSING OF QUERIES WITH PERCENTILE FUNCTIONS

    Publication No.: US20200065413A1

    Publication date: 2020-02-27

    Application No.: US16113633

    Filing date: 2018-08-27

    Abstract: A method and system for processing database queries containing aggregate functions. The query may specify fewer groups than there are processes available to process the queries. Further, the queries may target a set of rows and specify a sort-by key and a group-by key. The method and system further includes determining that the queries specify application of the aggregate function to each of a plurality of groups that may correspond to a plurality of distinct values of the group-by key and determining that plurality of processes are available to process the queries. The method and system also includes determining the plurality of ranges of a composite key that may be formed by combining the group-by key and the sort-by key and assigning each range of the plurality ranges to a corresponding process to calculate the aggregate function.

    Scalable and adaptive evaluation of reporting window functions
    Type: Granted invention patent
    Status: In force

    Publication No.: US09390129B2

    Publication date: 2016-07-12

    Application No.: US13754687

    Filing date: 2013-01-30

    CPC classification number: G06F17/30433 G06F17/30442 G06F17/30471

    Abstract: According to one aspect of the invention, for a database statement that specifies evaluating reporting window functions, a computation-pushdown execution strategy may be used for the database statement. The computation-pushdown execution plan includes producer operators and consolidation operators. Each producer operator computes a respective partial aggregation for each reporting window function based on a subset of rows, and broadcasts the respective partial aggregation. Each consolidation operator fully aggregates all partial aggregations broadcasted from the producer operators. Alternatively, an extended-data-distribution-key execution plan may be used. Each producer operator sends rows based on hash keys to sort operators for computing partial aggregations for at least one reporting window function based on a subset of rows. Each consolidation operator receives and fully aggregates all partial aggregations broadcasted from the sort operators.
