Abstract:
According to one aspect of the invention, for a database statement that specifies evaluating reporting window functions, a computation-pushdown execution strategy may be used for the database statement. The computation-pushdown execution plan includes producer operators and consolidation operators. Each producer operator computes a respective partial aggregation for each reporting window function based on a subset of rows, and broadcasts the respective partial aggregation. Each consolidation operator fully aggregates all partial aggregations broadcast from the producer operators. Alternatively, an extended-data-distribution-key execution plan may be used. In that plan, each producer operator sends rows, based on hash keys, to sort operators, which compute partial aggregations for at least one reporting window function based on a subset of rows. Each consolidation operator receives and fully aggregates all partial aggregations broadcast from the sort operators.
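The computation-pushdown flow described above can be illustrated with a minimal single-process sketch. It assumes the reporting window function is SUM(amount) OVER (PARTITION BY key); the function names, the (key, amount) row layout, and the use of plain lists in place of parallel operators and a broadcast channel are all hypothetical and not taken from the abstract.

```python
from collections import defaultdict

def producer_partial_sums(rows):
    """Partial SUM(amount) per partition key over one producer's subset of rows."""
    partial = defaultdict(int)
    for key, amount in rows:
        partial[key] += amount
    return dict(partial)

def consolidate(partials):
    """Fully aggregate the partial sums received from every producer."""
    total = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            total[key] += value
    return dict(total)

def reporting_sum(row_subsets):
    """Attach the full per-partition total to every row of every subset."""
    # Stand-in for the broadcast step: every consolidator sees all partials.
    partials = [producer_partial_sums(subset) for subset in row_subsets]
    totals = consolidate(partials)
    return [[(key, amount, totals[key]) for key, amount in subset]
            for subset in row_subsets]

# Two producers, each holding an arbitrary subset of the rows.
subsets = [[("a", 1), ("b", 10)], [("a", 2), ("b", 20), ("a", 3)]]
result = reporting_sum(subsets)
```

Because only one small partial result per partition key leaves each producer, the rows themselves never have to be redistributed by partition key in this sketch.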
Abstract:
Computer-implemented techniques for hash-based set operations. In some embodiments, the techniques are implemented in a computer database management system to improve the computational space or time efficiency of executing database query language statements that contain one or more set operations. With the hash-based techniques, duplicate record elimination and aggregation of the component query result sets are not required before combining the sets in a set operation, as the set operation itself performs aggregation on the records. As a result, the computational efficiency of performing the set operation is improved over a sort-based approach where a component query result set is not pre-sorted.
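As a rough illustration of why no separate duplicate-elimination pass is needed, the sketch below implements INTERSECT and EXCEPT semantics with a single hash table per operation. It is an assumption-laden toy, not the patented method: records are hashable tuples, everything fits in memory, and the function names are hypothetical.

```python
def hash_intersect(left, right):
    """Distinct records present in both inputs; duplicates collapse in the table."""
    matched = {}
    for record in left:
        # Inserting the same record twice is a no-op, so left needs no pre-dedup.
        matched.setdefault(record, False)
    for record in right:
        if record in matched:
            matched[record] = True
    return [record for record, hit in matched.items() if hit]

def hash_except(left, right):
    """Distinct records of left that never appear in right."""
    table = dict.fromkeys(left)
    for record in right:
        table.pop(record, None)
    return list(table)

left_rows = [(1,), (2,), (2,), (3,)]
right_rows = [(2,), (3,), (3,), (4,)]
common = hash_intersect(left_rows, right_rows)
only_left = hash_except(left_rows, right_rows)
```

Neither input is sorted or deduplicated beforehand; the hash table absorbs repeated records as they arrive.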
Abstract:
Techniques are described for storing and maintaining, in a materialized view, bitmap data that represents a bitmap of each possible distinct value of an expression, and for rewriting a query for a count of distinct values of the expression to use the materialized view. The materialized view contains bitmap data that represents a bitmap of each possible distinct value of a first expression, together with aggregate values of additional expressions, and is stored in memory or on disk by a database system. The database system receives a query that requests a number of distinct values of the first expression and an aggregate value for an additional expression. In response, the database system rewrites the query to compute the number of distinct values by counting the bits in the bitmap data of the materialized view that are set to a first value, and to obtain the aggregate value for the additional expression from the materialized view.
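The bitmap idea can be sketched in a few lines. The example assumes that each possible distinct value of the expression maps to a small non-negative integer id, that a bit set to 1 means "this value occurs", and that the additional aggregate is a SUM; the dictionary standing in for the materialized view and all names are hypothetical.

```python
def build_view(rows):
    """Build a toy 'materialized view': group -> (bitmap, sum of amount).

    rows are (group, value_id, amount) tuples; value_id is assumed to be a
    small non-negative integer identifying one possible distinct value.
    """
    view = {}
    for group, value_id, amount in rows:
        bitmap, total = view.get(group, (0, 0))
        view[group] = (bitmap | (1 << value_id), total + amount)
    return view

def query_from_view(view, group):
    """Answer COUNT(DISTINCT value) and SUM(amount) from the view alone."""
    bitmap, total = view[group]
    distinct_count = bin(bitmap).count("1")  # count the bits that are set
    return distinct_count, total

rows = [("x", 3, 10), ("x", 3, 5), ("x", 7, 1), ("y", 0, 2)]
view = build_view(rows)
answer_x = query_from_view(view, "x")
answer_y = query_from_view(view, "y")
```

Setting a bit is idempotent, so repeated values do not inflate the count, and the view can be maintained incrementally on insert in this sketch.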
Abstract:
Techniques are described for leveraging column dictionaries of tables for join, group-by and expression evaluation operations. In an embodiment, a table is stored in one or more data units, each data unit's metadata containing dictionaries for stored columns. Rather than storing unencoded column values, the data units may store columns as column vectors of dictionary-encoded values, in an embodiment. When performing a join operation, a matching of values may be performed on the build-side table using the unencoded values stored in the join-key dictionary or dictionaries of the probe-side table, thus significantly reducing the number of searching and matching operations. In an embodiment, a group-by operation may be executed by performing partial aggregations based on unique group-by key values as stored in the one or more group-by key dictionaries. For an expression evaluation, only a single evaluation may be performed for each unique combination of expression-key values in a data unit by leveraging the one or more expression-key dictionaries.
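The group-by and expression-evaluation cases can be sketched for a single data unit with one dictionary-encoded column. The encoding scheme (a sorted dictionary with integer codes), the SUM aggregate, and every function name here are illustrative assumptions rather than details from the abstract.

```python
def encode(values):
    """Dictionary-encode a column: returns (dictionary, column vector of codes)."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]

def group_sum(dictionary, codes, measures):
    """Partial SUM per group, indexed by code instead of hashing raw values."""
    sums = [0] * len(dictionary)
    for code, measure in zip(codes, measures):
        sums[code] += measure
    return {dictionary[code]: total for code, total in enumerate(sums)}

def eval_expression(dictionary, codes, fn):
    """Evaluate fn once per unique value, then look results up by code."""
    evaluated = [fn(value) for value in dictionary]
    return [evaluated[code] for code in codes]

dictionary, codes = encode(["us", "de", "us", "fr", "de"])
totals = group_sum(dictionary, codes, [1, 2, 3, 4, 5])
upper = eval_expression(dictionary, codes, str.upper)
```

With five rows but only three dictionary entries, `str.upper` runs three times instead of five; the saving grows with the ratio of rows to distinct values.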
Abstract:
According to one aspect of the invention, for a database statement that specifies evaluating ranking or cumulative window functions, an execution strategy based on an extended data distribution key may be used for the database statement. In the execution strategy, each sort operator of multiple parallel processing sort operators computes locally evaluated results of a ranking or cumulative window function based on a subset of rows in all rows used to evaluate the database statement, and sends the first and last rows' locally evaluated results to a query coordinator. The query coordinator consolidates the locally evaluated results received from the multiple parallel processing sort operators and sends consolidated results to the sort operators based on their respective demographics. Each sort operator completes full evaluation of the ranking or cumulative window functions based at least in part on one or more of the consolidated results provided by the query coordinator.
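A simplified sketch of the coordinator round trip follows, for a cumulative SUM only. It assumes the rows are already range-distributed so that every value held by one sort operator orders before every value held by the next, and it sends only each operator's last locally evaluated result to the coordinator; ranking functions, which also need the first row to resolve ties across boundaries, are not covered. All names are hypothetical.

```python
from itertools import accumulate

def local_running_sums(sorted_values):
    """Locally evaluated cumulative SUM over one sort operator's rows."""
    return list(accumulate(sorted_values))

def coordinator_offsets(last_results):
    """Coordinator: for each operator, the total of all preceding operators."""
    offsets, running = [], 0
    for last in last_results:
        offsets.append(running)
        running += last
    return offsets

def cumulative_sum(ranges):
    """Complete the evaluation by adding each operator's consolidated offset."""
    local = [local_running_sums(values) for values in ranges]
    # Each operator reports the local result of its last row (0 if it has no rows).
    last_results = [sums[-1] if sums else 0 for sums in local]
    offsets = coordinator_offsets(last_results)
    return [[value + offset for value in sums]
            for sums, offset in zip(local, offsets)]

# Three sort operators holding consecutive ranges of the ordered input.
final = cumulative_sum([[1, 2], [3, 4, 5], [6]])
```

Only one number per operator travels to the coordinator and one comes back, regardless of how many rows each operator holds.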