Abstract:
A technique and mechanism for transforming a query is described. A given query is transformed into a transformed query that references a global temporary table. Specifically, the given query contains a join between a given table and one or more other tables referenced by constraints (e.g. predicates). References to one or more of the constrained tables are replaced by a reference to a global temporary table. Before executing the transformed query, data that satisfies the constraints placed on the constrained table is inserted into the global temporary table.
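The transformation described above can be illustrated with a minimal sketch. It uses SQLite's `TEMP` table as a stand-in for a global temporary table (the two are not equivalent), and the `sales`, `products`, and `tmp_products` tables are hypothetical names chosen for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales(product_id INTEGER, amount REAL);
CREATE TABLE products(product_id INTEGER, category TEXT);
INSERT INTO sales VALUES (1, 5.0), (2, 7.5), (3, 9.0);
INSERT INTO products VALUES (1, 'toys'), (2, 'toys'), (3, 'food');
""")

# Original query: a join with a constrained table (products).
original = """
SELECT s.amount FROM sales s
JOIN products p ON s.product_id = p.product_id
WHERE p.category = 'toys'
ORDER BY s.amount
"""

# Before running the transformed query, load the rows that satisfy the
# constraint into the temporary table (assumption: TEMP table as stand-in).
conn.execute("CREATE TEMP TABLE tmp_products(product_id INTEGER)")
conn.execute(
    "INSERT INTO tmp_products "
    "SELECT product_id FROM products WHERE category = 'toys'"
)

# Transformed query: the reference to products is replaced by tmp_products,
# and the constraint no longer appears because it was applied at load time.
transformed = """
SELECT s.amount FROM sales s
JOIN tmp_products t ON s.product_id = t.product_id
ORDER BY s.amount
"""

original_rows = conn.execute(original).fetchall()
transformed_rows = conn.execute(transformed).fetchall()
```

Both queries return the same rows here; the sketch only shows the rewrite itself, not when an optimizer would choose it.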
Abstract:
A method and apparatus for processing star queries is provided. According to the method, a star query is transformed by adding to the star query subqueries that are not in the query. The subqueries are generated based on join predicates and constraints on dimension tables that are contained in the original query. The subqueries are executed, and the values returned by the subqueries are used to access one or more bitmap indexes built on columns of the fact table. The bitmaps retrieved for the values returned by each subquery are merged to create one subquery bitmap per subquery. An AND operation is performed on the subquery bitmaps, and the resulting bitmap is used to determine which data to retrieve from the fact table.
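As a rough illustration of the bitmap steps above, the following sketch models each bitmap as a Python integer whose bit `i` stands for fact-table row `i`. The fact rows, dimension dictionaries, and function names are all hypothetical, and the "subqueries" are plain list comprehensions rather than generated SQL.

```python
from functools import reduce

# Hypothetical fact table rows: (product_id, store_id, amount).
fact = [
    (1, 10, 5.0),
    (2, 10, 7.5),
    (1, 20, 3.0),
    (3, 20, 9.0),
    (2, 30, 4.0),
]
# Hypothetical dimension tables: key -> constrained attribute.
products = {1: "toys", 2: "toys", 3: "food"}
stores = {10: "west", 20: "east", 30: "west"}

def build_bitmap_index(rows, col):
    """Map each distinct value of a fact column to a bitmap of row ids."""
    index = {}
    for rowid, row in enumerate(rows):
        index[row[col]] = index.get(row[col], 0) | (1 << rowid)
    return index

product_idx = build_bitmap_index(fact, 0)
store_idx = build_bitmap_index(fact, 1)

def subquery_bitmap(index, keys):
    """Merge (OR) the bitmaps for every key returned by one subquery."""
    return reduce(lambda acc, k: acc | index.get(k, 0), keys, 0)

def star_query(category, region):
    # Each "subquery" returns the dimension keys satisfying one constraint.
    product_keys = [k for k, c in products.items() if c == category]
    store_keys = [k for k, r in stores.items() if r == region]
    # AND the per-subquery bitmaps, then fetch only the surviving fact rows.
    result = (subquery_bitmap(product_idx, product_keys)
              & subquery_bitmap(store_idx, store_keys))
    return [fact[i] for i in range(len(fact)) if (result >> i) & 1]
```

For example, `star_query("toys", "west")` touches only the fact rows whose bits survive the AND.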
Abstract:
Techniques are provided for executing distinct aggregation operations in a manner that is more scalable and efficient than prior techniques. A three-stage technique is provided to parallelize aggregation operations that involve both grouping and multiple distinct-key columns. Such queries are handled by splitting rows into as many pieces as there are distinct aggregates in the query, and processing the row pieces. During the first stage, a set of slave processes scans the rows of the base tables and performs partial duplicate elimination. During the second stage, a second set of slave processes completes the duplicate elimination and performs partial set function aggregation. During the third stage, a third set of slave processes completes the set aggregation to produce the results of the distinct aggregation operation. In addition, two-stage parallelization techniques are provided for parallelizing single-distinct aggregations, and for parallelizing distinct aggregation operations that involve multiple distinct-key columns but do not require grouping.
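The three stages can be sketched sequentially for a query of the form `SELECT g, COUNT(DISTINCT a), COUNT(DISTINCT b) ... GROUP BY g`. This is a simplified simulation under stated assumptions: the slave processes are ordinary function calls, the redistribution between stages is a hash on the row piece, the third stage is a single call rather than a set of processes, and all names are hypothetical.

```python
from collections import defaultdict

def stage1(rows):
    """Scan rows, split each into one piece per distinct aggregate,
    and eliminate the duplicates this scanner happens to see."""
    pieces = set()
    for g, a, b in rows:
        pieces.add((g, "a", a))
        pieces.add((g, "b", b))
    return pieces

def stage2(pieces):
    """Finish duplicate elimination, then count partially per (group, column)."""
    partial = defaultdict(int)
    for g, col, _value in set(pieces):
        partial[(g, col)] += 1
    return partial

def stage3(partials):
    """Sum the partial counts to produce the final distinct counts per group."""
    totals = defaultdict(lambda: {"a": 0, "b": 0})
    for partial in partials:
        for (g, col), n in partial.items():
            totals[g][col] += n
    return {g: dict(counts) for g, counts in totals.items()}

def distinct_aggregate(scan_chunks, n_workers=2):
    stage1_out = [stage1(chunk) for chunk in scan_chunks]
    # Redistribute pieces so that equal pieces reach the same stage-2 worker.
    buckets = [[] for _ in range(n_workers)]
    for pieces in stage1_out:
        for piece in pieces:
            buckets[hash(piece) % n_workers].append(piece)
    return stage3([stage2(bucket) for bucket in buckets])
```

Because equal pieces always hash to the same bucket, each distinct value is counted by exactly one stage-2 worker, which is what makes the partial counts safe to add in the last stage.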
Abstract:
Several techniques for sorting items are described, generally referred to as (1) common prefix skipping quicksort; (2) key substring caching; and (3) adaptive quicksort. With common prefix skipping quicksort, common prefix bytes among all key values for a partition are computed while performing a quicksort partitioning operation, and the known common bytes are skipped when comparing two key values in a recursive partitioning operation. With key substring caching, each item is represented in a cached array comprising a particular number of bytes for respective portions of key values (“key substring”), where the key substring cache is updated to contain bytes beyond the known number of common prefix bytes. An adaptive quicksort routine is a hybrid of a quicksort function and a most significant digit radix sort function, where the functions are mutually recursive.
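The first technique, common prefix skipping, can be sketched as follows. This is a simplified, non-in-place version written for illustration: it covers only the prefix-skipping idea (not key substring caching or the radix hybrid), and the function names are hypothetical.

```python
def compare_from(a, b, skip):
    """Compare byte strings a and b starting at offset `skip`.

    Returns (sign, common), where sign is -1, 0, or 1 and common is the
    number of leading bytes the two keys share.
    """
    n = min(len(a), len(b))
    i = skip
    while i < n and a[i] == b[i]:
        i += 1
    if i < n:
        return (-1 if a[i] < b[i] else 1), i
    # One key is a prefix of the other (or they are equal).
    return (len(a) > len(b)) - (len(a) < len(b)), i

def cps_quicksort(keys, skip=0):
    """Sort byte strings; every key in `keys` shares its first `skip` bytes."""
    if len(keys) <= 1:
        return list(keys)
    pivot = keys[len(keys) // 2]
    less, equal, greater = [], [], []
    less_common = greater_common = None
    for k in keys:
        sign, common = compare_from(k, pivot, skip)
        if sign < 0:
            less.append(k)
            less_common = common if less_common is None else min(less_common, common)
        elif sign > 0:
            greater.append(k)
            greater_common = common if greater_common is None else min(greater_common, common)
        else:
            equal.append(k)
    # Every key in a partition shares at least the minimum common length with
    # the pivot, hence with each other, so those bytes are skipped below.
    return (cps_quicksort(less, less_common or 0)
            + equal
            + cps_quicksort(greater, greater_common or 0))
```

The prefix length for each partition falls out of comparisons the partitioning step performs anyway, so computing it adds no extra pass over the keys.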
Abstract:
A method, system and product for coordinating a parallel update for a global index of an indexed table involves a coordinator process and slave processes. The coordinator process receives index maintenance records from data manipulation slaves for an indexed table. Each index maintenance record includes a value for an index key of a global index of the table. The coordinator process computes index key value ranges and sends each range to an index update slave. Each slave updates the global index using just the index maintenance records with key values in its respective range, thus avoiding contention among the slaves and increasing clustering so that scalable parallelism may be more closely attained. Techniques are also described for deferring the maintenance of global indexes relative to the time when the table on which they are built is changed.
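One way to picture the range computation is the sketch below. It assumes the coordinator picks range boundaries as quantiles of the key values it has received, which is an illustrative choice and not necessarily how the described method computes them; the function names are hypothetical.

```python
import bisect

def compute_boundaries(keys, n_slaves):
    """Pick n_slaves - 1 boundary keys so the ranges hold similar record counts."""
    ordered = sorted(keys)
    if not ordered:
        return []
    return [ordered[len(ordered) * i // n_slaves] for i in range(1, n_slaves)]

def assign_to_slaves(records, boundaries):
    """Route each (key, operation) record to the slave owning its key range."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for key, op in records:
        buckets[bisect.bisect_right(boundaries, key)].append((key, op))
    return buckets
```

Since the ranges are disjoint, no two slaves ever receive records for the same key, which is the property the abstract relies on to avoid contention.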
Abstract:
A method and apparatus are provided for allocating buffer memory for database sort operations. A database parameter is set to determine whether and how direct write buffers are to be allocated for sort operations. If the parameter is set to a first value, then direct write buffers will be used to perform writes to disk. The size and number of direct write buffers to be used will be determined by the values set in other database parameters. If the parameter is set to a second value, then no direct write buffers will be used, and sort operations will write to disk through a buffer cache. If the parameter is set to a third value, direct write buffers will be allocated a portion of the memory available to perform the sort operation. The size and number of direct write buffers will be determined in accordance with database formulae that are designed to optimize sort and data write performance.
Abstract:
A method and apparatus for batch processing of updates to indexes is provided. A plurality of index update records are generated that identify a plurality of index update operations to be made to an index. The plurality of index update records are sorted and then applied, in an order that corresponds to the sort order, in batches to their respective indexes. As a result of performing batch processing of updates to indexes, the number of disk I/Os will be greatly reduced, freeing database system resources to perform other tasks. The overall efficiency of index maintenance is also improved, as is the recovery of the database system after a failure, as a result of the ordering of the index maintenance operations and a partial ordering of the persistent redo log file.
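The effect of sorting before applying can be sketched with a toy index. Here the index is a dictionary from key to a set of row ids, and a hypothetical `leaf_block` function stands in for the disk block a key would live in; counting block visits is only a crude proxy for disk I/O.

```python
from itertools import islice

def leaf_block(key, keys_per_block=100):
    """Hypothetical mapping from an index key to the leaf block holding it."""
    return key // keys_per_block

def apply_batched(index, updates, batch_size=3):
    """Sort (key, op, rowid) records by key and apply them batch by batch.

    Returns how many times the block being modified changed.
    """
    # sorted() is stable, so operations on the same key keep their order.
    ordered = iter(sorted(updates, key=lambda u: u[0]))
    block_visits = 0
    last_block = None
    while True:
        batch = list(islice(ordered, batch_size))
        if not batch:
            break
        for key, op, rowid in batch:
            block = leaf_block(key)
            if block != last_block:
                block_visits += 1
                last_block = block
            entries = index.setdefault(key, set())
            if op == "insert":
                entries.add(rowid)
            else:
                entries.discard(rowid)
    return block_visits
```

Applied in arrival order, updates that alternate between distant keys would revisit the same blocks repeatedly; in sorted order each block is visited once.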
Abstract:
A method and apparatus are provided for allocating buffer memory for database sort operations. A database parameter is set to determine whether and how direct write buffers are to be allocated for sort operations. If the parameter is set to a first value, then no direct write buffers will be used, and sort operations will write to disk through a buffer cache. If the parameter is set to a second value, then direct write buffers will be used to perform writes to disk. The size and number of direct write buffers to be used will be determined by the values set in other database parameters. If the parameter is set to a third value, direct write buffers will be allocated a portion of the memory available to perform the sort operation. The size and number of direct write buffers will be determined in accordance with database formulae that are designed to optimize sort and data write performance.
Abstract:
Techniques are described for combining pieces of information from two sources. The techniques may be used to improve the performance, for example, of hash join operations that are parallelized using slaves distributed across multiple nodes. According to one technique, bitmap filtering operations are performed by the probe-phase producer slaves, rather than the probe-phase consumer slaves. To avoid having to merge separately built bitmap filter chunks, the left-hand rows may be sent to every probe-phase consumer slave. Alternatively, the merge operation may be avoided by distributing the rows of one source based on how the other source has been statically partitioned.
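The producer-side filtering idea can be sketched as below. The bitmap is a single Python integer, the slaves are plain function calls, and the filter size and function names are assumptions made for this example; distribution across nodes is not modeled.

```python
def build_bitmap(build_rows, n_bits=64):
    """Set one bit per join key seen on the build (left-hand) side."""
    bitmap = 0
    for key, _payload in build_rows:
        bitmap |= 1 << (hash(key) % n_bits)
    return bitmap

def producer_filter(probe_rows, bitmap, n_bits=64):
    """Run by the probe-phase producer: drop rows that cannot possibly join,
    so they are never sent to a consumer."""
    return [row for row in probe_rows
            if (bitmap >> (hash(row[0]) % n_bits)) & 1]

def hash_join(build_rows, probe_rows, n_bits=64):
    table = {}
    for key, payload in build_rows:
        table.setdefault(key, []).append(payload)
    bitmap = build_bitmap(build_rows, n_bits)
    joined = []
    # The filter may pass false positives; the hash table lookup removes them.
    for key, payload in producer_filter(probe_rows, bitmap, n_bits):
        for left in table.get(key, []):
            joined.append((key, left, payload))
    return joined
```

Filtering before rows leave the producer saves the cost of shipping non-matching rows, at the price of the producer needing the complete bitmap.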
Abstract:
A method and apparatus for parallelizing operations that change a database is provided. A coordinator process receives a statement that requires data to be written into the database. In response to the statement, the coordinator process assigns granules of work to multiple processes. Each of the multiple processes executes the granule of work by writing to the database a distinct portion of the set of data that is to be added to the database. The various portions of data are then merged to update the set of data to be added to the database.
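A minimal sketch of the coordinator pattern follows, using threads in place of database processes and Python lists in place of storage. The granule-splitting rule and all names are hypothetical; the point is only that each worker writes a distinct portion and the portions are merged afterwards.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_granules(rows, n_granules):
    """Coordinator step: divide the rows into disjoint granules of work."""
    return [rows[i::n_granules] for i in range(n_granules)]

def write_granule(granule):
    """Worker step: write one granule into its own private segment."""
    return list(granule)

def parallel_insert(table, rows, n_workers=3):
    granules = split_into_granules(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        segments = list(pool.map(write_granule, granules))
    # Merge step: attach every worker's segment to the target table.
    for segment in segments:
        table.extend(segment)
    return table
```

Because the granules are disjoint, the workers never write the same row, so no coordination is needed between them until the merge.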