Abstract:
A method and system are provided for processing queries. According to one aspect of the invention, a query that does not reference a particular materialized view is rewritten to reference the materialized view. In particular, upon receiving the query, it is determined whether the particular materialized view satisfies each condition in a set of conditions, where the set of conditions at least includes a condition that the materialized view reflects all rows that exist in a common section. The common section is a section of the query that is common to both the materialized view and the query. If the materialized view satisfies each condition in the set of conditions, then the query is rewritten to produce a rewritten query that references the materialized view. The materialized view may be a summary table that includes a summary column. The summary column contains values generated by aggregating values contained in rows produced by a one-to-many lossless join. The one-to-many lossless join is not in the common section. The query includes a cumulative aggregate function. Under these conditions, the method includes generating results of the cumulative aggregate function in the query by dividing values from the summary column by scaling factors.
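The scaling-factor step can be illustrated with a small sketch. It assumes a hypothetical `sales` table, a hypothetical `product_tags` child table joined one-to-many on the product key, a `SUM` aggregate, and a summary grouped by that same key; none of these names come from the abstract.

```python
from collections import defaultdict

# Hypothetical base rows: (product_id, amount)
sales = [("p1", 10), ("p1", 5), ("p2", 7)]
# Hypothetical child rows joined one-to-many on product_id.
# Every product has at least one child row, so the join loses no sales rows.
product_tags = [("p1", "a"), ("p1", "b"), ("p1", "c"), ("p2", "a")]

def build_summary(sales, product_tags):
    """SUM(amount) over sales JOIN product_tags, grouped by product_id."""
    summary = defaultdict(int)
    for pid, amount in sales:
        for tag_pid, _ in product_tags:
            if tag_pid == pid:
                summary[pid] += amount  # each sale is counted once per child row
    return dict(summary)

def scaling_factors(product_tags):
    """Number of child rows per product, i.e. how often each sale was duplicated."""
    counts = defaultdict(int)
    for pid, _ in product_tags:
        counts[pid] += 1
    return dict(counts)

def answer_from_summary(summary, factors):
    """Recover SUM(amount) per product by dividing out the duplication."""
    return {pid: total / factors[pid] for pid, total in summary.items()}

def answer_from_base(sales):
    """Reference answer computed directly from the base table."""
    totals = defaultdict(int)
    for pid, amount in sales:
        totals[pid] += amount
    return dict(totals)

summary = build_summary(sales, product_tags)
rewritten = answer_from_summary(summary, scaling_factors(product_tags))
direct = answer_from_base(sales)
```

In this sketch the division works because every row in a group is duplicated the same number of times; a real rewrite would have to establish that property before applying it.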
Abstract:
Methods are provided for collecting query-workload-based statistics within a relational database management system (RDBMS) and for identifying columns for which statistics collection is to be performed. The novel system collects workload statistics that depend on multiple columns, rather than merely on single columns. Multi-column statistics generation provides more accurate results for columns having correlated data, and therefore leads to better cost estimates by an RDBMS optimizer. In one embodiment, a column duplicity factor is based on an analysis of distinct data rows, e.g., combinations of values within multiple columns, rather than rows of single columns. The novel system also collects separate statistics regarding the presence of null data within the rows of a column group. Separate null data statistics improve the result cardinality determined by the RDBMS optimizer because the cardinality of a relational operation's result is generally determined by the number of input rows with non-null data. The novel system includes an RDBMS optimizer that automatically identifies columns and column groups on which workload statistics are to be generated. The parameters within a query (e.g., equi-joins, equi-selections, and projections) are analyzed by the optimizer to automatically identify the column groups. The identified column groups are then registered in a system catalog. The registered column groups are read by statistics generation procedures to identify those column groups for which workload statistics are to be collected.
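A minimal sketch of why column-group statistics help with correlated data follows. The `make`/`model` rows and the `column_group_stats` helper are hypothetical, and treating a row as null when any column of the group is null is an assumption of this sketch rather than something the abstract states.

```python
def column_group_stats(rows, columns):
    """Count rows, null rows, and distinct value combinations for a column group.

    Assumption: a row counts as null for the group if any grouped column is None.
    """
    non_null = [
        tuple(row[c] for c in columns)
        for row in rows
        if all(row[c] is not None for c in columns)
    ]
    return {
        "rows": len(rows),
        "null_rows": len(rows) - len(non_null),
        "distinct": len(set(non_null)),
    }

# Hypothetical table in which model strongly determines make (correlated columns).
rows = [
    {"make": "Honda", "model": "Civic"},
    {"make": "Honda", "model": "Accord"},
    {"make": "Toyota", "model": "Corolla"},
    {"make": "Toyota", "model": "Camry"},
    {"make": "Honda", "model": "Civic"},
    {"make": None, "model": None},
]

make_stats = column_group_stats(rows, ("make",))
model_stats = column_group_stats(rows, ("model",))
group_stats = column_group_stats(rows, ("make", "model"))

# Treating the columns as independent overestimates the number of combinations.
independent_guess = make_stats["distinct"] * model_stats["distinct"]
non_null_rows = group_stats["rows"] - group_stats["null_rows"]

# Estimated rows matching "make = ? AND model = ?" under each assumption.
rows_if_independent = non_null_rows / independent_guess
rows_from_group = non_null_rows / group_stats["distinct"]
```

Here the single-column statistics suggest 8 combinations while only 4 exist, so the independence-based estimate is half of the group-based one.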
Abstract:
Described herein are approaches to implementing dynamic sampling in a way that lessens or eliminates the additional overhead incurred to perform dynamic sampling. Also described are techniques for determining characteristics about predicates not previously determined by conventional techniques for dynamic sampling. Dynamic sampling is used by a query optimizer to dynamically estimate predicate selectivities and statistics. When a database statement is received by a database server, an initial analysis of the database statement is made to determine the efficacy of dynamic sampling, that is, to determine whether optimization of the query would benefit from dynamic sampling and whether performance is not excessively impacted by the dynamic sampling process. If this analysis determines dynamic sampling should be used, then dynamic sampling is undertaken.
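The two-stage flow, an efficacy check followed by sampling, might look roughly like the sketch below. The `should_sample` criteria (missing statistics and a minimum table size) and the fixed sample size are illustrative assumptions; the abstract does not say how efficacy is judged.

```python
import random

def should_sample(has_statistics, row_count, min_rows=1000):
    """Hypothetical efficacy check: sample only when stored statistics are
    missing and the table is large enough for a bad estimate to matter."""
    return (not has_statistics) and row_count >= min_rows

def estimate_selectivity(rows, predicate, sample_size, seed=0):
    """Estimate the fraction of rows satisfying predicate from a random sample."""
    if not rows:
        return 0.0
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rng.sample(rows, min(sample_size, len(rows)))
    matches = sum(1 for row in sample if predicate(row))
    return matches / len(sample)

# Hypothetical table of 10,000 integer rows; the true selectivity is 0.25.
table = list(range(10_000))
predicate = lambda value: value % 4 == 0

if should_sample(has_statistics=False, row_count=len(table)):
    selectivity = estimate_selectivity(table, predicate, sample_size=500)
else:
    selectivity = None  # fall back to stored statistics or defaults
```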
Abstract:
An access structure analysis method is interspersed with the query optimization process. The method can determine the ideal combination of access structures, including both materialized views and indexes, for a given database workload. The interspersed structure analysis method can include advanced transformations such as view merging, star transformation, bitmap access plans, and query rewrite using materialized views. The method may be performed using the query optimizer's rules as heuristics to guide the index candidate generation process.
Abstract:
Techniques are described which allow function-defined hierarchies to be registered with a database server. The information provided to the server during the registration process is used by the server to determine how to roll up data that has been aggregated at one level of a function-defined hierarchy to another level of the function-defined hierarchy. Techniques are also provided to perform rollup from one level of a function-defined hierarchy to another level of the function-defined hierarchy on data stored in a materialized view. Further, techniques are provided for rewriting queries that require aggregation at one level of a function-defined hierarchy to cause them to access data from a materialized view that stores data at a different level of the function-defined hierarchy.
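As a rough illustration of rollup along a function-defined hierarchy, the sketch below registers level-to-level mapping functions and uses them to roll a month-level aggregate up to year level. The registry, the level names, and the `monthly_mv` data are hypothetical, and the approach shown is valid only for aggregates such as `SUM` that can be combined from partial results.

```python
from collections import defaultdict

# Hypothetical registry: (from_level, to_level) -> function mapping a key
# at from_level to its parent key at to_level.
hierarchy = {}

def register(from_level, to_level, fn):
    """Record how values at one level map to the next level up."""
    hierarchy[(from_level, to_level)] = fn

register("day", "month", lambda day: day[:7])      # "2024-03-15" -> "2024-03"
register("month", "year", lambda month: month[:4])  # "2024-03" -> "2024"

def rollup(aggregated, from_level, to_level):
    """Re-aggregate SUM totals stored at from_level up to to_level."""
    fn = hierarchy[(from_level, to_level)]
    result = defaultdict(int)
    for key, total in aggregated.items():
        result[fn(key)] += total
    return dict(result)

# Hypothetical materialized view holding SUM(sales) per month.
monthly_mv = {"2023-12": 40, "2024-01": 10, "2024-02": 25}

# A query asking for yearly totals can be answered from the monthly view.
yearly = rollup(monthly_mv, "month", "year")
```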
Abstract:
An incremental refresh of a materialized view may be simplified, and therefore made more cost efficient, by reducing the number of DML operations being merged with the materialized view during the incremental refresh. Specifically, subsequences of sequences of data manipulation language operations that have been recorded for a particular row of a base table may be inspected to determine whether the subsequences conform to particular patterns of data manipulation language operator types. If a subsequence conforms to one of the particular patterns, the subsequence may be replaced with a single substitute: either a single data manipulation language operation, or null. Refresh operations that are generated based on the simplified sequences of data manipulation language operations are simpler, and therefore less costly to perform.
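The pattern-replacement idea can be sketched as a reduction over the operations logged for one row. The specific patterns in `PATTERNS` are illustrative assumptions (the abstract does not list them), with `I`, `U`, and `D` standing for insert, update, and delete.

```python
# Hypothetical pattern table: (first, second) -> single substitute.
# None means the pair cancels out entirely (the "null" substitute).
PATTERNS = {
    ("I", "D"): None,  # row created and removed between refreshes
    ("I", "U"): "I",   # insert followed by update is still a net insert
    ("U", "U"): "U",   # consecutive updates collapse into one
    ("U", "D"): "D",   # update followed by delete is a net delete
    ("D", "I"): "U",   # delete then re-insert acts like an update
}

def simplify(ops):
    """Collapse a row's DML sequence by repeatedly replacing matching pairs."""
    stack = []
    for op in ops:
        stack.append(op)
        # Keep reducing while the two most recent operations match a pattern.
        while len(stack) >= 2 and (stack[-2], stack[-1]) in PATTERNS:
            second = stack.pop()
            first = stack.pop()
            substitute = PATTERNS[(first, second)]
            if substitute is not None:
                stack.append(substitute)
    return stack

cancelled = simplify(["I", "U", "U", "D"])  # row never needs to reach the view
net_delete = simplify(["U", "U", "D"])
net_update = simplify(["D", "I", "U"])
```

With these assumed patterns, a row that is inserted, updated twice, and deleted between refreshes generates no refresh work at all.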
Abstract:
Approaches, techniques, and mechanisms are disclosed for maintaining a set of baseline query plans for a database command. Except in rare circumstances, a database server may execute a command only according to a baseline plan, even if the database server predicts that a different plan has a lower cost. The baseline plans are plans that, for one reason or another, have been determined to provide acceptable actual performance in at least one execution context. When the database server receives a request to execute a particular command, the database server, if possible, always executes the command according to the baseline plan with the lowest predicted cost. The database server may evolve the plan baseline to include additional plans by generating and testing new plans in response to new requests to execute the database command, or as part of a query optimization or tuning process.
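A minimal sketch of baseline-constrained plan selection follows. The plan names, the costs, the fallback when no baseline plan is available, and the acceptance rule in `evolve` are all assumptions made for illustration.

```python
def choose_plan(predicted_costs, baseline):
    """Pick the cheapest plan that is in the baseline.

    Assumption: if no baseline plan is available, fall back to the cheapest
    plan overall.
    """
    usable = {plan: cost for plan, cost in predicted_costs.items() if plan in baseline}
    if usable:
        return min(usable, key=usable.get)
    return min(predicted_costs, key=predicted_costs.get)

def evolve(baseline, plan, measured_time, best_baseline_time):
    """Hypothetical acceptance rule: admit a new plan only if a test run
    shows it is at least as fast as the best existing baseline plan."""
    if measured_time <= best_baseline_time:
        return baseline | {plan}
    return baseline

# Hypothetical predicted costs for one command.
predicted_costs = {"hash_join": 120, "nested_loop": 80, "merge_join": 100}
baseline = {"hash_join", "merge_join"}

# nested_loop is predicted to be cheapest but is not yet trusted.
before = choose_plan(predicted_costs, baseline)

# After a test execution verifies nested_loop, it joins the baseline.
baseline = evolve(baseline, "nested_loop", measured_time=0.8, best_baseline_time=1.0)
after = choose_plan(predicted_costs, baseline)
```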
Abstract:
Queries are optimized according to a first optimization mode by generating execution plans and selecting the lowest cost plan. Where a database operation has input parameters that are inconsistent with the first optimization mode, inputs to that operation that were optimized according to the first optimization mode are replaced with equivalent inputs optimized according to a second optimization mode that is consistent with those input parameters. Blocking operations are eliminated from queries using a cost-based approach.
Abstract:
A method and apparatus for refreshing stale materialized views is provided. Prior to executing a query to refresh a materialized view from data in the base tables of the materialized view, the query is rewritten to refresh the materialized view from data in one or more other materialized views. To take advantage of the efficiency gained by refreshing a materialized view based on another materialized view, a refresh sequence is established based on the dependencies between materialized views in the database system. The dependencies indicate which materialized views can be refreshed from which other materialized views. When a materialized view can be refreshed based on any one of a number of eligible materialized views, the refresh sequence may additionally take into account the relative benefit associated with refreshing the materialized view with each of the eligible materialized views.
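One way to picture the refresh sequence is as a topological ordering over the chosen refresh sources, as in the sketch below. The view names and the per-source cost figures are hypothetical, and picking the lowest-cost eligible source is only one possible reading of "relative benefit".

```python
from graphlib import TopologicalSorter

def plan_refresh(eligible):
    """Choose a source for each view, then order the refreshes.

    eligible maps each view to {source_view: hypothetical refresh cost};
    an empty mapping means the view is refreshed from its base tables.
    """
    # Assumption: the cheapest eligible source offers the greatest benefit.
    chosen = {
        view: (min(sources, key=sources.get) if sources else None)
        for view, sources in eligible.items()
    }
    # Each view depends on its chosen source being refreshed first.
    deps = {view: ({src} if src else set()) for view, src in chosen.items()}
    order = TopologicalSorter(deps).static_order()
    return [(view, chosen[view]) for view in order]

# Hypothetical materialized views and the views each could be refreshed from.
eligible = {
    "sales_by_day": {},
    "sales_by_month": {"sales_by_day": 30},
    "sales_by_year": {"sales_by_day": 365, "sales_by_month": 12},
}

sequence = plan_refresh(eligible)
```

In this example `sales_by_year` could be refreshed from either of the other two views, and the sketch picks `sales_by_month` because its assumed cost is lower.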