Abstract:
Various embodiments of the invention provide solutions to allow more sophisticated management of the relationship between a database and its clients (which can be, inter alia, end users, business applications, etc.). Merely by way of example, some embodiments can facilitate the management of work requests in a database, as well as the management of the quality of service in a database system. In some embodiments, an identification handle may be assigned to a database work request. A database management application can use the identification handle to identify the work request and, possibly, any related work requests. The identification handle may also identify the database (and/or an instance thereof) and/or a clustered database node, and the identification handle may be transmitted to a mid-tier application, e.g., to notify the mid-tier about the processing of the work request, changes in quality of service, server availability, etc.
Abstract:
A query coordinator handles a multiple-server dynamic performance query by sending remote query slaves (1) first information for generating a complete plan for the query, and (2) second information for participating in the dynamic performance view portion of the query. If the slaves on the remote servers are unable to use the first information to generate an equivalent query (for example, if they reside in a database server that has closed the database), then the slaves on the remote servers use the second information to participate in the dynamic performance view portion of the query.
Abstract:
A method and apparatus for performing recursive database operations is provided. According to one aspect, a plurality of first-stage slaves and a plurality of second-stage slaves are established in a database server. During one or more iterations of a recursive database operation, the first-stage slaves concurrently process data items stored in a data repository and send results to the second-stage slaves. The second-stage slaves receive the results and concurrently process those results. The second-stage slaves store the results of the second-stage slaves' processing in the data repository. Subsequent iterations of the recursive database operation proceed in this manner until the recursive database operation has been completed. In each iteration, the first-stage slaves consume the product of the second-stage slaves' previous iteration's processing, and the second-stage slaves consume the product of the first-stage slaves' current iteration's processing.
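The two-stage iteration described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the recursive operation is a graph reachability search, models the slaves as thread pools, and uses hypothetical names (`first_stage`, `second_stage`, `recursive_operation`, `edges`) that do not appear in the source.

```python
from concurrent.futures import ThreadPoolExecutor


def first_stage(node, edges):
    """Hypothetical first-stage work: expand one data item into candidate results."""
    return edges.get(node, [])


def second_stage(candidates, seen):
    """Hypothetical second-stage work: keep only results not produced earlier."""
    return {c for c in candidates if c not in seen}


def recursive_operation(edges, start, workers=4):
    """Iterate until an iteration stores no new data items in the repository."""
    repository = {start}  # output of the previous iteration (assumed in-memory)
    seen = {start}
    with ThreadPoolExecutor(max_workers=workers) as stage1, \
            ThreadPoolExecutor(max_workers=workers) as stage2:
        while repository:
            # First-stage slaves consume the previous iteration's stored results.
            expanded = list(stage1.map(lambda n: first_stage(n, edges),
                                       sorted(repository)))
            # Second-stage slaves consume the first stage's current results.
            filtered = list(stage2.map(lambda c: second_stage(c, seen), expanded))
            # Store the second stage's output for the next iteration.
            repository = set().union(*filtered)
            seen |= repository
    return seen
```

For example, `recursive_operation({1: [2, 3], 2: [4], 3: [4], 4: []}, 1)` visits every node reachable from node 1 and stops once an iteration produces nothing new.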
Abstract:
Techniques are provided for evenly distributing data items of a particular set of data to a plurality of buckets. The buckets of data items may then be assigned to processes to perform operations on the data items in parallel with the other processes. In one embodiment, the set of data (which may come from tables or be the result set of a previous operation) is divided into a plurality of subsets. For each subset of the plurality of subsets, a sample of data items is randomly selected. The sampling itself may be performed in parallel, with each sampling process using a different seed to randomize its selection of samples. The sampled data items are sorted and ranges are determined based on distribution keys of the sampled data items. The ranges are assigned to buckets, and the data items are then distributed to the buckets assigned to the range into which their distribution key falls.
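A minimal sketch of sample-based range distribution follows, assuming each data item is its own distribution key and that sampling runs sequentially rather than in parallel. The function names and the `base_seed` parameter are illustrative assumptions, not taken from the source.

```python
import random
from bisect import bisect_right


def sample_subsets(subsets, sample_size, base_seed=0):
    """Randomly sample each subset, giving every sampling pass its own seed."""
    samples = []
    for i, subset in enumerate(subsets):
        rng = random.Random(base_seed + i)  # different seed per subset
        k = min(sample_size, len(subset))
        samples.extend(rng.sample(subset, k))
    return samples


def range_boundaries(samples, num_buckets):
    """Sort the sampled keys and pick num_buckets - 1 split points."""
    ordered = sorted(samples)
    if not ordered:
        return []
    return [ordered[(i * len(ordered)) // num_buckets]
            for i in range(1, num_buckets)]


def distribute(items, boundaries):
    """Place each item in the bucket whose range contains its key."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for key in items:
        buckets[bisect_right(boundaries, key)].append(key)
    return buckets
```

Because the split points come from a sorted sample, buckets should hold roughly equal numbers of items when the sample is representative; a skewed or tiny sample would weaken that balance.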
Abstract:
Methods are provided for automatically discovering correlations between values in columns of tables. A set of significantly correlated columns is identified by identifying correlated columns, and by determining the significance of the correlation between the correlated columns from one or more tables. If the correlated columns are considered significantly correlated, a correlation table is constructed that includes records representing distinct combinations of values corresponding to the correlated columns. Embodiments include methods for identifying correlated columns, for determining the significance of the correlation between the correlated columns, and for using the resultant correlation table to enhance performance of a query execution process. One particular embodiment provides for using a correlation table for partition pruning a partitioned table, with respect to a query execution plan.
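One way to picture the use of a correlation table for partition pruning is the sketch below. It assumes rows are dictionaries, that the table is list-partitioned on the second column, and that the significance test has already passed; `build_correlation_table` and `prune_partitions` are hypothetical names introduced only for illustration.

```python
def build_correlation_table(rows, col_a, col_b):
    """Record each distinct (col_a, col_b) combination found in the rows."""
    return {(row[col_a], row[col_b]) for row in rows}


def prune_partitions(correlation, a_value, partitions):
    """Keep only partitions whose key co-occurs with a_value in the table."""
    b_values = {b for a, b in correlation if a == a_value}
    return [p for p in partitions if p in b_values]


# Assumed sample data: orders placed in one month ship in that month or the next.
orders = [
    {"order_month": 1, "ship_month": 1},
    {"order_month": 1, "ship_month": 2},
    {"order_month": 2, "ship_month": 2},
    {"order_month": 3, "ship_month": 4},
]
correlation = build_correlation_table(orders, "order_month", "ship_month")
# A predicate on order_month = 1 needs only the ship_month partitions 1 and 2.
remaining = prune_partitions(correlation, 1, [1, 2, 3, 4])
```

The query predicate is on a column the table is not partitioned by, yet the correlation table lets the planner translate it into a restriction on the partitioning column.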
Abstract:
The present invention is directed to a method and mechanism for partitioning using information not directly located in the object being partitioned. According to an embodiment of the invention, foreign key-primary key relationships are utilized to create join conditions between multiple database tables to implement partitioning of a database object. Also disclosed are methods and mechanisms to perform partition pruning.
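As a rough illustration of partitioning by a key that lives in another table, the sketch below assigns child rows to partitions by following a foreign key to the parent row. The in-memory dictionaries, the function name, and the column names are assumptions made for the example, not details from the source.

```python
def partition_child_rows(parents, children, pk, fk, part_col):
    """Partition child rows by a column that exists only on the parent table."""
    # Join condition: child[fk] == parent[pk].
    parent_partition = {p[pk]: p[part_col] for p in parents}
    partitions = {}
    for child in children:
        key = parent_partition[child[fk]]
        partitions.setdefault(key, []).append(child)
    return partitions


# Assumed sample data: line items carry no region column of their own.
orders = [{"order_id": 1, "region": "EU"}, {"order_id": 2, "region": "US"}]
line_items = [
    {"item_id": 10, "order_id": 1},
    {"item_id": 11, "order_id": 2},
    {"item_id": 12, "order_id": 1},
]
by_region = partition_child_rows(orders, line_items, "order_id", "order_id", "region")
```

A query that filters orders on `region` can then skip every line-item partition whose key does not match, which is the pruning opportunity the abstract refers to.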
Abstract:
Auto-tuning can be performed by receiving a database query language statement and performance information related to the statement, determining whether one or more performance statistics of the statement are available or missing in the performance information, and determining an auto-tuning hint for each missing statistic.
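The flow described above might look like the following sketch. The statistic names in `REQUIRED_STATS` and the generated hint strings are placeholders invented for this example; the source does not name specific statistics or hints.

```python
# Hypothetical statistics a tuner might expect; not taken from the source.
REQUIRED_STATS = ("row_count", "distinct_values", "avg_row_length")


def auto_tuning_hints(statement, performance_info):
    """Return one placeholder hint for each statistic missing for the statement."""
    hints = []
    for stat in REQUIRED_STATS:
        if performance_info.get(stat) is None:
            # Placeholder hint text; a real system would emit its own hint syntax.
            hints.append(f"MISSING_{stat.upper()}")
    return {"statement": statement, "hints": hints}


result = auto_tuning_hints(
    "SELECT * FROM orders WHERE region = 'EU'",
    {"row_count": 1200, "distinct_values": None},
)
```

Here `row_count` is available, so hints are produced only for the two statistics that are absent or empty.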
Abstract:
A self-managing workload repository (AWR) infrastructure is provided that is useful for a database server to collect and manage selected sets of important system performance statistics. Based on a schedule, the AWR runs automatically to collect data about the operation of the database system, and stores the data that it captures into the database. The AWR is advantageously designed to be lightweight and to self-manage its use of storage space so as to avoid ending up with a repository of performance data that is larger than the database that it is capturing data about. The AWR is configured to automatically capture snapshots of statistics data on a periodic basis as well as purge stale data on a periodic basis. Both the frequency of the statistics data capture and the length of time for which data is kept are adjustable. Manual snapshots and purging may also be performed. The AWR captured data allows for both system level and user level analysis to be automatically performed without unduly impacting system performance, e.g., by eliminating or reducing the requirement to repeat the workload in order to diagnose problems.
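The periodic capture and purge behavior can be sketched as below. This is an assumption-laden toy: the class name, the `tick` method, and the use of caller-supplied timestamps in seconds are all hypothetical, and a real repository would store snapshots in the database rather than in a list.

```python
class WorkloadRepository:
    """Toy snapshot store with an adjustable capture interval and retention."""

    def __init__(self, interval_s, retention_s):
        self.interval_s = interval_s    # how often to capture a snapshot
        self.retention_s = retention_s  # how long to keep a snapshot
        self.snapshots = []             # list of (timestamp, stats) pairs
        self._last_capture = None

    def snapshot(self, now, read_stats):
        """Capture a snapshot immediately (also usable as a manual snapshot)."""
        self.snapshots.append((now, dict(read_stats())))
        self._last_capture = now

    def purge(self, now):
        """Drop snapshots older than the retention window."""
        cutoff = now - self.retention_s
        self.snapshots = [(t, s) for t, s in self.snapshots if t >= cutoff]

    def tick(self, now, read_stats):
        """Scheduler hook: capture if the interval has elapsed, then purge."""
        due = (self._last_capture is None
               or now - self._last_capture >= self.interval_s)
        if due:
            self.snapshot(now, read_stats)
        self.purge(now)
        return due
```

Calling `tick` every 30 seconds with `interval_s=60` and `retention_s=180` captures a snapshot on every second call and keeps storage bounded to the snapshots taken within the last three minutes.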
Abstract:
A self-managing workload repository infrastructure (or “AWR” for Automatic Workload Repository) is provided that is useful for a database server to collect and manage useful system performance statistics. The AWR runs automatically to collect performance data about the operation of the database system, and stores the data that it captures into the database. The collection process is done inside the database, and the collection process is highly efficient as data is retrieved directly from the shared memory of the database kernel. The data captured allows both system level and user level analysis to be performed without unduly impacting system performance, e.g., by eliminating or reducing the requirement to repeat the workload in order to diagnose problems. The AWR is configured to automatically capture snapshots of statistics data on a periodic basis as well as purge stale data on a periodic basis. The captured performance data includes one or more of the top N (e.g., 20 or 30) statistics for activities involving a large set of objects, time-based statistics, cumulative statistics, sampled data and metrics and other data types.
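Capturing only the top N statistics for a large set of objects can be illustrated with a short sketch. The function name `top_n_statistics` and the sample counters are assumptions; the source does not say how the top N entries are selected.

```python
import heapq


def top_n_statistics(stats, n):
    """Return the n (object, value) pairs with the largest values, highest first."""
    return heapq.nlargest(n, stats.items(), key=lambda kv: kv[1])


# Assumed sample data: physical reads per table.
reads_per_table = {"orders": 500, "customers": 90, "audit_log": 1200, "tmp": 3}
captured = top_n_statistics(reads_per_table, 2)
```

Keeping only the largest entries limits snapshot size when the number of tracked objects is large, at the cost of dropping detail about less active objects.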
Abstract:
An intelligent database infrastructure wherein the management of all database components is performed by and within the database itself by integrating management of various components with a central management control. Each individual database component, as well as the central management control, is self-managing. A central management control module integrates and interacts with the various database components. The database is configured to automatically tune to varying workloads and configurations, correct or alert about bad conditions, and advise on ways to improve overall system performance.