摘要:
Techniques are described for combining pieces of information from two sources. The techniques may be used to improve the performance, for example, of hash join operations that are parallelized using slaves distributed across multiple nodes. According to one technique, bitmap filtering operations are performed by the probe-phase producer slaves, rather than the probe-phase consumer slaves. To avoid having to merge separately built bitmap filter chunks, the left-hand rows may be sent to every probe-phase consumer slave. Alternatively, the merge operation may be avoided by distributing the rows of one source based on how the other source has been statically partitioned.
摘要:
A query coordinator handles a multiple-server dynamic performance query by sending remote query slaves (1) first information for generating a complete plan for the query, and (2) second information for participating in the dynamic performance view portion of the query. If the slaves on the remote servers are unable to use the first information to generate an equivalent query (for example, if they reside in a database server that has closed the database), then the slaves on the remote servers use the second information to participate in the dynamic performance view portion of the query.
摘要:
Techniques for automatically recommending parallel execution of a SQL statement. In one set of embodiments, a first determination can be made regarding whether a SQL statement can be executed in parallel. Further, a second determination can be made regarding whether executing the SQL statement in parallel is faster than executing the statement in serial by a predetermined factor. If the first determination and second determination are positive (i.e., the statement can be executed in parallel and parallel execution is faster by the predetermined factor), a recommendation can be provided indicating that the SQL statement should be executed in parallel. In some embodiments, the recommendation can include a report specifying the degree of performance improvement gained from parallel execution, additional system resources consumed by parallel execution, and other statistics pertaining to the recommended parallel execution plan.
摘要:
Techniques for automatically recommending parallel execution of a SQL statement. In one set of embodiments, a first determination can be made regarding whether a SQL statement can be executed in parallel. Further, a second determination can be made regarding whether executing the SQL statement in parallel is faster than executing the statement in serial by a predetermined factor. If the first determination and second determination are positive (i.e., the statement can be executed in parallel and parallel execution is faster by the predetermined factor), a recommendation can be provided indicating that the SQL statement should be executed in parallel. In some embodiments, the recommendation can include a report specifying the degree of performance improvement gained from parallel execution, additional system resources consumed by parallel execution, and other statistics pertaining to the recommended parallel execution plan.
摘要:
Techniques are provided for executing database statements, or portions thereof, in parallel without using slave SQL to communicate to each slave the operations to be performed by the slave. Techniques are provided for incorporating within a shared cursor the code fragments that govern both sides of the interaction between a query coordinator (QC) and remotely-located slaves. Further, techniques are provided for the QC to communicate with each slave on how and which portions of the execution plan to execute and when. A state-transition engine for slave execution under the control of the query-coordinator is also provided.
摘要:
Systems, methods, and other embodiments associated with hybrid optimization strategies in automatic SQL tuning are described. One example method includes receiving a first (e.g., cost-based) execution plan for a user structured query language statement (User SQL) from a first (e.g., cost-based) optimizer. The example method may also include receiving a second (e.g., rules-based) execution plan for the User SQL from a second, different (e.g., rules-based) query optimizer. The method may include identifying a preferred execution plan based on data produced by test executing the execution plans in a reproduced execution environment that reproduces at least a portion of an execution environment in which the user SQL runs. The method may also include controlling a database to execute the User SQL using the preferred execution plan.
摘要:
A method and apparatus for approximating a database statistic, such as the number of distinct values (NDV) is provided. To approximate the NDV for a portion of a table, a synopsis of distinct values is constructed. Each value in the portion is mapped to a domain of values. The mapping function is implemented with a uniform hash function, in one embodiment. If the resultant domain value does not exist in the synopsis, the domain value is added to the synopsis. If the synopsis reaches its capacity, a portion of the domain values are discarded from the synopsis. The statistic is approximated based on the number (N) of domain values in the synopsis and the portion of the domain that is represented in the synopsis relative to the size of the domain.
摘要:
A method and apparatus for performing recursive database operations is provided. According to one aspect, a plurality of first-stage slaves and a plurality of second-stage slaves are established in a database server. During one or more iterations of a recursive database operation, the first-stage slaves concurrently process data items stored in a data repository and send results to the second-stage slaves. The second-stage slaves receive the results and concurrently process those results. The second-stage slaves store the results of the second-stage slaves' processing in the data repository. Subsequent iterations of the recursive database operation proceed in this manner until the recursive database operation has been completed. In each iteration, the first-stage slaves consume the product of the second-stage slaves' previous iteration's processing, and the second-stage slaves consume the product of the first-stage slaves' current iteration's processing.
摘要:
Techniques are provided for performing a parallel aggregation operation on data that resides in a container, such as a relational table. During generation of the execution plan for the operation, it is determined whether partition-wise aggregation should be performed, based on the grouping keys involved in the aggregation and the partition keys used to partition the container. If partition-wise aggregation is to be performed, then the assignments given to the slave processes that are assigned to scan a container are made on a partition-wise basis. The scan slaves themselves may perform full or partial aggregation (depending on whether they are the only scan slaves assigned to the partition). If the scan slaves perform no aggregation, or only partial aggregation, then the scan slaves redistribute the data items to aggregation slaves that are local to the scan slaves.
摘要:
Execution of a restartable sub-tree of a query execution plan comprises determining whether use of parallel processes is a preferred or optimal mode of executing the sub-tree. The determination is based, at least in part, on how long it takes to restart the sub-tree using two or more parallel processes and/or how long it takes to probe the sub-tree, i.e., to fetch a row that meets one or more conditions or correlations associated with the sub-query, using the two or more parallel processes. Thus, a dynamic computational cost-based operation is described, which determines at query runtime whether to execute the restartable sub-tree using a single server process or multiple parallel server processes.