摘要:
Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.
摘要:
A method for estimating the result of a query on a database having data records arranged in tables. The database has an expected workload that includes a set of queries that can be executed on the database. An expected workload is derived including a set of queries that can be executed on the database. A sample is constructed by selecting data records for inclusion in the sample in a manner that minimizes an estimation error when the data records are acted upon by a query in the expected workload to provide an expected workload to provide an expected result. The query accesses the sample and is executed on the sample, returning an estimated query result. The expected workload can be constructed by specifying a degree of overlap between records selected by queries in the given workload and records selected by queries in the expected workload.
摘要:
The subject disclosure pertains to exploitation of sequence information in a workload for the purpose of performance tuning. An optimal or exhaustive approach to tune sequences by mapping to a shortest path problem is provided as well as powerful techniques that result in optimal or nearly optimal solutions. Also disclosed is greedy approach that is much more efficient than the optimal approach and yet also generates solutions of comparable quality. Further yet, systems and methods are provided that facilitate extraction of sequence information and implementation of recommendations.
摘要:
A method is provided for tuning a database to recommend a set of physical design structures for the database that optimize database performance for a given workload given a total time bound that defines a maximum amount of time that can be spent tuning the database. A cumulative set of recommended structures is maintained and incrementally updated based on tuning that is performed in intervals over portions of the workload. The cumulative set of recommended structures is updated by tuning the database by examining a predetermined portion of the workload during a time slice that is a fraction of the total time bound. At the end of the time slice, a set of recommended structures has been enumerated that is based on the workload portions that have been examined thus far. The set of recommended structures is updated until all queries in the workload have been examined or until the time bound is reached.
摘要:
Systems and methodologies that split a table into a plurality of sub-tables, and vertical partitions. By analyzing an associated work load to determine frequently referenced columns, the subject invention supplies a compromise among various vertical partitioning strategies (e.g., candidate selection for table spilt) via a merging act, such that the table is split optimally for the work load taken as a whole Accordingly, an incoming query can optimally reference only required columns.
摘要:
Internal communications within components of an automated physical database design tool may be conducted in a data description language such as XML. Inputs to and outputs from the automated physical database design tool may also be presented in the data description language (e.g., XML). The communications, inputs and outputs may comply with a schema for the data description language. The schema may be written in a schema language such as XSD. Inputs presented in the data description language may comprise tuning options. Outputs may comprise a proposed physical design for a database and reports.
摘要:
A monitoring component of a database server collects a subset of a query workload along with related statistics. A remote index tuning component uses the workload subset and related statistics to determine a physical design that minimizes the cost of executing queries in the workload subset while ensuring that queries omitted from the subset do not degrade in performance.
摘要:
A database object summarization tool is provided that selects a subset of database objects subject to filtering constraints such as a partial order or optimization of some attribute. A dominance primitive filters out tuples that are dominated according to a partial order constraint by another tuple. A representation primitive selects a representative subset of tuples such than an optimization criteria is met.
摘要:
A framework is provided within a database system for specifying database monitoring rules that will be evaluated as part of the execution code path of database events being monitored. The occurrence of a selected database event triggers a rule that evaluates some parameter of an object related to the event against a condition in the rule. If the condition is met, a specified action is taken that can alter the execution of the database event or database system performance. Lightweight aggregation tables are utilized to enable aggregation of object parameter values so that presently occurring events can be compared to a summary of the object parameter values from previously occurring database events. Signatures are assigned to queries based on the structure of the query plan so that information in the lightweight aggregation tables can be grouped according to query signature.
摘要:
Relational database applications such as index selection, histogram tuning, approximate query processing, and statistics selection have recognized the importance of leveraging workloads. Often these applications are presented with large workloads, i.e., a set of SQL DML statements, as input. A key factor affecting the scalability of such applications is the size of the workload. The invention concerns workload compression which helps improve the scalability of such applications. The exemplary embodiment is broadly applicable to a variety of workload-driven applications, while allowing for incorporation of application specific knowledge. The process is described in detail in the context of two workload-driven applications: index selection and approximate query processing.