Abstract:
A new type of table join operation, outer semi join (OSJ), is provided, which can be used by an optimizer layer and an execution layer of a database management system (DBMS). OSJ combines the semantics of both left outer-join and semi-join. The concept of an anti-join marker (AJM) is also introduced, which specifies whether a matching row was not found between joined tables for each result row in an OSJ operation. The OSJ operation supports unnesting of a class of disjunctive ANY, ALL, EXISTS, NOT EXISTS, IN, and NOT IN subqueries for execution plan optimization. The disjunction may contain filter predicates. For unnesting, OSJ avoids the need to use a distinct operator on the right table and also supports using inequality (e.g. >, >=,
Abstract:
Techniques for automatically preventing execution plan regressions are provided. In one technique, in a first user database session, in response to receiving a first database statement, a first execution plan is generated and, while executing the first execution plan, first performance data that indicates one or more first performance metrics of executing the first execution plan is recorded. In response to receiving a second database statement, where the first execution plan may be used to generate a result for the second database statement, a second execution plan is generated and second performance data that indicates one or more second performance metrics of executing the second execution plan is recorded. A comparison between the first performance data and the second performance data is performed. Based on the comparison, it is determined whether the second execution plan will be stored for future use to process a database statement.
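As a rough illustration of the comparison step, the Python sketch below decides whether a newly generated plan should be kept for future use. The `PlanStats` type, the metric names (`elapsed_ms`, `buffer_gets`), and the rule that the new plan must be no worse on every metric are assumptions made for this example; the abstract does not say which metrics are recorded or how they are compared.

```python
from dataclasses import dataclass


@dataclass
class PlanStats:
    """Performance data recorded while executing one plan (hypothetical metrics)."""
    plan_id: str
    elapsed_ms: float
    buffer_gets: int


def should_store_new_plan(first: PlanStats, second: PlanStats,
                          tolerance: float = 1.0) -> bool:
    """Return True if the second plan is no worse than the first on every metric.

    `tolerance` is an assumed knob: 1.0 means "must not be slower at all",
    1.1 would accept up to 10% degradation.
    """
    return (second.elapsed_ms <= first.elapsed_ms * tolerance
            and second.buffer_gets <= first.buffer_gets * tolerance)


# Example data: the first plan and two possible second plans.
first_plan = PlanStats("plan_a", elapsed_ms=120.0, buffer_gets=5000)
faster_plan = PlanStats("plan_b", elapsed_ms=80.0, buffer_gets=3000)
slower_plan = PlanStats("plan_c", elapsed_ms=300.0, buffer_gets=9000)
```

Under these assumptions, `faster_plan` would be stored and `slower_plan` would be discarded, so the regression never reaches later executions.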
Abstract:
Techniques for the automatic creation and maintenance of zone maps are provided. In one technique, a set of data sets is identified. For each data set, a data set width is determined based on a maximum value in the data set and a minimum value in the data set. One or more zones within the data set are identified. For each zone, a zone width is determined based on a difference between a maximum value in that zone and a minimum value in that zone. An aggregate zone width is generated that is based on the zone width of each zone. Based on the data set width and the aggregate zone width, it is determined whether to automatically generate a zone map for the data set.
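The width comparison can be sketched in Python as follows. The abstract only states that the decision is based on the data set width and the aggregate zone width; using the mean as the aggregate, dividing it by the data set width, and comparing against a `threshold` of 0.25 are all assumptions chosen for illustration, as are the function names.

```python
from statistics import mean


def zone_widths(values, zone_size):
    """Split values into contiguous zones and return max - min for each zone."""
    zones = [values[i:i + zone_size] for i in range(0, len(values), zone_size)]
    return [max(z) - min(z) for z in zones]


def should_create_zone_map(values, zone_size, threshold=0.25):
    """Decide whether a zone map looks worthwhile for this data set.

    Assumption: zones that are narrow relative to the whole data set mean the
    data is well clustered, so min/max pruning would skip many zones.
    """
    data_set_width = max(values) - min(values)
    if data_set_width == 0:
        return False  # assumed: a constant column gains nothing from pruning
    aggregate_zone_width = mean(zone_widths(values, zone_size))
    return aggregate_zone_width / data_set_width <= threshold


clustered = list(range(100))                      # sorted: each zone is narrow
scattered = [(i * 37) % 100 for i in range(100)]  # shuffled: each zone is wide
```

With a zone size of 10, the sorted data yields narrow zones and passes the check, while the shuffled data yields zones almost as wide as the whole data set and fails it.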
Abstract:
Techniques are provided for using zone maps to improve the performance of a much wider range of queries than those for which zone maps are currently used. Specifically, techniques are provided for using zone maps to improve performance of queries by providing aggregate values for a wide range of aggregate operations, including SUM, AVG, etc., providing aggregate values for aggregate queries that specify filter conditions, distinguishing between situations in which the aggregate values for a zone are invalid for pruning purposes and situations in which the aggregate values are invalid for query-answering purposes, determining when aggregate values may be used in multi-table zone maps where the type of join specified by a query differs from the type of join used to generate the aggregate values in the zone map, and selecting among different aggregate values for the same zone based on the type of join specified in a query.
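One way to picture answering a filtered aggregate from a zone map is the single-table Python sketch below. It is an assumed simplification, not the method of the abstract: each zone entry stores `min`, `max`, `sum`, and `count` (hypothetical field names), a zone entirely inside the filter range contributes its stored sum, a zone entirely outside is pruned, and only a partially overlapping zone is scanned.

```python
def build_zone_map(values, zone_size):
    """Record min, max, sum, and count for each contiguous zone of rows."""
    zone_map = []
    for start in range(0, len(values), zone_size):
        rows = values[start:start + zone_size]
        zone_map.append({
            "start": start, "end": start + len(rows),
            "min": min(rows), "max": max(rows),
            "sum": sum(rows), "count": len(rows),
        })
    return zone_map


def sum_between(values, zone_map, lo, hi):
    """Compute SUM(v) WHERE lo <= v <= hi, returning (total, zones_scanned)."""
    total, scanned = 0, 0
    for zone in zone_map:
        if zone["max"] < lo or zone["min"] > hi:
            continue                      # no row can qualify: prune the zone
        if lo <= zone["min"] and zone["max"] <= hi:
            total += zone["sum"]          # every row qualifies: use stored sum
        else:
            rows = values[zone["start"]:zone["end"]]
            total += sum(v for v in rows if lo <= v <= hi)
            scanned += 1                  # partial overlap: read the rows
    return total, scanned


values = list(range(1, 21))
zone_map = build_zone_map(values, zone_size=5)
total, scanned = sum_between(values, zone_map, lo=6, hi=17)
```

In this example only one of the four zones has to be read; the other three are either pruned or answered from the stored sums. An AVG could be derived the same way from the stored sums and counts.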
Abstract:
Embodiments generate random walks through a directed graph that is represented in a relational database table. Each row of the graph table represents a directed edge in the graph and includes a source vertex and a destination vertex. Each row is further augmented to (a) indicate the number of outbound edges starting from the destination vertex in the row and (b) include an identifier that distinguishes the edge from other outbound edges starting from the same source vertex. An SQL query may be executed on the augmented graph table. Starting from a source vertex (the starting vertex or the destination vertex of the previously selected hop), the query randomly selects a row of the graph table representing one of the outbound edges from the source vertex and adds the selected outbound edge as a row in a random walk table that represents the next hop in the random walk.
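The augmented table and the hop selection can be sketched with Python's built-in `sqlite3` module. The table and column names (`graph`, `dst_out_degree`, `edge_id`) are hypothetical, edge identifiers are assumed to run from 1 to the out-degree of the source vertex, and the walk is driven by a Python loop that issues one lookup per hop rather than by the single SQL query the abstract describes.

```python
import random
import sqlite3


def build_graph_table(conn, edges):
    """Create the augmented edge table and return each vertex's out-degree."""
    out_degree = {}
    for src, _ in edges:
        out_degree[src] = out_degree.get(src, 0) + 1
    conn.execute("CREATE TABLE graph (src TEXT, dst TEXT, "
                 "dst_out_degree INTEGER, edge_id INTEGER)")
    next_id = {}
    for src, dst in edges:
        next_id[src] = next_id.get(src, 0) + 1   # 1..out_degree(src)
        conn.execute("INSERT INTO graph VALUES (?, ?, ?, ?)",
                     (src, dst, out_degree.get(dst, 0), next_id[src]))
    return out_degree


def random_walk(conn, start, start_out_degree, hops, rng):
    """Return a list of (src, dst) edges forming a random walk from start."""
    walk = []
    vertex, degree = start, start_out_degree
    for _ in range(hops):
        if degree == 0:
            break                                 # dead end: no outbound edges
        k = rng.randint(1, degree)                # pick one outbound edge
        src, dst, dst_degree = conn.execute(
            "SELECT src, dst, dst_out_degree FROM graph "
            "WHERE src = ? AND edge_id = ?", (vertex, k)).fetchone()
        walk.append((src, dst))
        vertex, degree = dst, dst_degree          # next hop starts at dst
    return walk


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("c", "b")]
conn = sqlite3.connect(":memory:")
degrees = build_graph_table(conn, edges)
walk = random_walk(conn, "a", degrees["a"], hops=5, rng=random.Random(0))
```

Storing the destination's out-degree in each row means the next hop can draw its random edge identifier without a separate count query, which appears to be the purpose of augmentation (a).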
Abstract:
Techniques related to efficient data retrieval based on aggregate characteristics of composite tables are provided. A join zone map includes entries that describe data from a join relationship between a first key column of a first table and a second key column of a second table. The first table includes a dimension column. Each entry of the join zone map corresponds to a respective zone. Each zone includes contiguous data blocks that correspond to one or more second key column values. Each entry also includes a respective dimension value range of one or more dimension column values. Each dimension value range includes a respective maximum dimension value and a respective minimum dimension value. Furthermore, each entry includes a respective anti-join attribute value that indicates whether any of the one or more second key column values in a particular zone are non-null and fail to match any first key column values.
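A minimal Python sketch of such entries is shown below. It assumes the first table is given as a dictionary `dim` mapping first-key values to dimension values, the second table's key column is a list `fact_keys` in storage order, and a zone is a fixed number of consecutive rows; these names and the dictionary layout of each entry are illustrative only.

```python
def build_join_zone_map(fact_keys, dim, zone_size):
    """Build one entry per zone of the second table's key column.

    Each entry holds the min/max dimension value reachable through the join
    and an anti-join flag that is True when some non-null key in the zone
    has no matching key in `dim`.
    """
    entries = []
    for start in range(0, len(fact_keys), zone_size):
        zone = fact_keys[start:start + zone_size]
        matched = [dim[k] for k in zone if k is not None and k in dim]
        anti_join = any(k is not None and k not in dim for k in zone)
        entries.append({
            "zone": start // zone_size,
            "min_dim": min(matched) if matched else None,
            "max_dim": max(matched) if matched else None,
            "anti_join": anti_join,
        })
    return entries


dim = {1: "CA", 2: "NY", 3: "TX"}               # first table: key -> dimension
fact_keys = [1, 1, 2, 3, None, 3, 2, 9]         # second table: key column
join_zone_map = build_join_zone_map(fact_keys, dim, zone_size=4)
```

Here the second zone's flag is set because key 9 has no match, while the null key is ignored. A consumer looking for unmatched keys could skip any zone whose flag is false.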
Abstract:
Techniques are described herein to generate and to execute a query execution plan using static data buffering. After receiving a query with a clause that requires multiple iterations to execute, a database management system (DBMS) generates a plurality of plans that vary the order in which the database operations are executed. Within each plan, the DBMS identifies sets of rows within that plan that contain static data during execution of the query. Then, an additional step is added to each plan that includes loading the static set of rows in a database buffer cache. One or more database operations, from an iteration other than the first iteration, may be performed against the cached static set of rows. For each plan generated in this manner, a cost analysis model is applied, and the plan with the lowest estimated computational cost is selected for use as the query execution plan.
Abstract:
Techniques are provided for partition pruning based on aggregated zone map information. In one embodiment, for example, a method for pruning partitions based on aggregated zone map information comprises: receiving a query statement comprising a filter predicate on a column of a database table; and pruning one or more partitions of the database table from access paths for processing the query statement based on determining, based on aggregated zone map information associated with the one or more partitions, that the filter predicate cannot be satisfied by data stored in the one or more partitions.
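The pruning decision can be illustrated with a short Python sketch. It assumes that the aggregated zone map information for a partition is simply the overall minimum and maximum across that partition's per-zone (min, max) pairs, and that the filter predicate is a closed range `lo <= column <= hi`; the function names and partition names are hypothetical.

```python
def aggregate_zone_maps(partition_zones):
    """Collapse each partition's per-zone (min, max) pairs into one pair."""
    return {
        name: (min(lo for lo, _ in zones), max(hi for _, hi in zones))
        for name, zones in partition_zones.items()
    }


def partitions_to_scan(aggregated, lo, hi):
    """Keep only partitions whose value range can overlap [lo, hi]."""
    return [
        name for name, (part_min, part_max) in aggregated.items()
        if not (part_max < lo or part_min > hi)   # otherwise predicate is unsatisfiable
    ]


partition_zones = {
    "p2023": [(1, 50), (40, 120)],
    "p2024": [(200, 260), (250, 300)],
    "p2025": [(310, 400)],
}
aggregated = aggregate_zone_maps(partition_zones)
remaining = partitions_to_scan(aggregated, lo=250, hi=320)
```

For the range 250 to 320, partition `p2023` is pruned because its aggregated maximum of 120 lies below the range, so only `p2024` and `p2025` remain in the access path.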