Abstract:
A virtual regulator monitors and manages a plurality of database systems in a domain. Each of the database systems is tuned for a particular type of workload. The virtual regulator, or multiple virtual regulators running in parallel, routes a set of one or more queries to a particular database system within the domain based on a cost function for each database system.
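The cost-based routing described above can be illustrated with a minimal sketch. The abstract does not specify the cost function, so the formula below (a mismatch penalty multiplied by a load factor) and all names such as `DatabaseSystem`, `estimated_cost`, and `route` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatabaseSystem:
    name: str
    tuned_for: str       # workload type the system is tuned for (assumed labels)
    current_load: float  # 0.0 = idle, 1.0 = saturated (assumed scale)


def estimated_cost(system: DatabaseSystem, workload_type: str, base_cost: float) -> float:
    """Hypothetical cost function: penalize workload mismatch and current load."""
    mismatch_penalty = 1.0 if system.tuned_for == workload_type else 3.0
    return base_cost * mismatch_penalty * (1.0 + system.current_load)


def route(systems: List[DatabaseSystem], workload_type: str, base_cost: float) -> DatabaseSystem:
    """Pick the system in the domain with the lowest estimated cost."""
    return min(systems, key=lambda s: estimated_cost(s, workload_type, base_cost))


domain = [
    DatabaseSystem("sys_a", tuned_for="tactical", current_load=0.2),
    DatabaseSystem("sys_b", tuned_for="analytic", current_load=0.5),
]
chosen = route(domain, workload_type="analytic", base_cost=10.0)
print(chosen.name)
```

In this sketch a busier system can still win the routing decision when it is tuned for the incoming workload, because the mismatch penalty outweighs the load factor.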
Abstract:
A computer-implemented apparatus, method, and article of manufacture manage a plurality of database systems and perform data maintenance tasks in a data warehouse system. A domain includes a plurality of database systems. A virtual regulator manages the domain, detects a request to invoke a data maintenance task on a first system in the domain, routes the data maintenance task, for execution, to a second system in the domain, and applies results from the data maintenance task (executed by the second system) to the first system.
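A minimal sketch of this offloading flow follows. It assumes that the second system holds a copy of the relevant data and that the maintenance task is a statistics collection; neither detail is stated in the abstract. `System`, `VirtualRegulator`, and `collect_statistics` are hypothetical names.

```python
from typing import Callable, Dict, List


class System:
    def __init__(self, name: str, rows: List[int], load: float) -> None:
        self.name = name
        self.rows = list(rows)            # assumed: each system holds a copy of the data
        self.load = load
        self.stats: Dict[str, int] = {}   # where maintenance results are applied

    def run_task(self, task: Callable[[List[int]], Dict[str, int]]) -> Dict[str, int]:
        return task(self.rows)


class VirtualRegulator:
    def __init__(self, systems: List[System]) -> None:
        self.systems = systems

    def handle_maintenance(self, target: System,
                           task: Callable[[List[int]], Dict[str, int]]) -> System:
        """Run a task requested on `target` on another system, then apply the result."""
        others = [s for s in self.systems if s is not target]
        worker = min(others, key=lambda s: s.load)  # assumed policy: least-loaded peer
        result = worker.run_task(task)              # executed by the second system
        target.stats.update(result)                 # results applied to the first system
        return worker


def collect_statistics(rows: List[int]) -> Dict[str, int]:
    """Illustrative maintenance task: simple table statistics."""
    return {"row_count": len(rows), "distinct_values": len(set(rows))}


primary = System("primary", [1, 2, 2, 3], load=0.9)
replica = System("replica", [1, 2, 2, 3], load=0.1)
regulator = VirtualRegulator([primary, replica])
regulator.handle_maintenance(primary, collect_statistics)
print(primary.stats)
```

The least-loaded selection policy is only one possible choice; the abstract says only that the task is routed to a second system in the domain.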
Abstract:
Optimizing the execution of a query in a multi-database system includes identifying a region within a table, the table being referenced in the query. The region is stored on data-storage devices on first and second system databases in the multi-database system. A first access plan for the query is developed, the first access plan comprising accessing the version of the region stored on the first system database. A second access plan for the query is developed, the second access plan comprising accessing the version of the region stored on the second system database. A selection is made between the first access plan and the second access plan to execute the query. The query is executed using the selected access plan to produce a result.
Abstract:
Optimizing the execution of a query in a multi-database system includes identifying a region within a table, the table being referenced in the query. The region is stored on a data-storage device on a first of the system databases in the multi-database system. The region is stored on a data-storage device on a second of the system databases in the multi-database system, the second system database being a different system database than the first system database. A first access plan for the query is developed, the first access plan comprising accessing the version of the region stored on the first system database. A second access plan for the query is developed, the second access plan comprising accessing the version of the region stored on the second system database. A selection is made between the first access plan and the second access plan to execute the query. The query is executed using the selected access plan to produce a result. The result is stored.
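The plan-selection step can be sketched as follows, assuming the selection criterion is the lowest estimated cost. The abstract does not state the criterion, and the cost components (`scan_cost`, `transfer_cost`) and the names `RegionCopy`, `AccessPlan`, `develop_plan`, and `select_plan` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionCopy:
    system: str           # system database holding this version of the region
    scan_cost: float      # assumed: estimated cost to read the region there
    transfer_cost: float  # assumed: estimated cost to return rows to the requester


@dataclass(frozen=True)
class AccessPlan:
    system: str
    cost: float


def develop_plan(copy: RegionCopy) -> AccessPlan:
    """Develop an access plan that reads the region from one system database."""
    return AccessPlan(system=copy.system, cost=copy.scan_cost + copy.transfer_cost)


def select_plan(first: AccessPlan, second: AccessPlan) -> AccessPlan:
    """Select between the two plans; ties go to the first plan."""
    return first if first.cost <= second.cost else second


first_plan = develop_plan(RegionCopy("system_1", scan_cost=40.0, transfer_cost=5.0))
second_plan = develop_plan(RegionCopy("system_2", scan_cost=25.0, transfer_cost=10.0))
selected = select_plan(first_plan, second_plan)
print(selected.system, selected.cost)
```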
Abstract:
A system, method, and computer-readable medium that facilitate efficient use of cache memory in a massively parallel processing system are provided. A residency time of a data block to be stored in cache memory or a disk drive is estimated. A metric is calculated for the data block as a function of the residency time. The metric may further be calculated as a function of the data block size. One or more data blocks stored in cache memory are evaluated by comparing a respective metric of the one or more data blocks with the metric of the data block to be stored. A determination is then made to either store the data block on the disk drive or flush the one or more data blocks from the cache memory and store the data block in the cache memory. In this manner, the cache memory may be more efficiently utilized by storing smaller data blocks with shorter residency times and flushing larger data blocks with significant residency times from the cache memory. The disclosed cache management mechanisms are effective for many workloads and are adaptable to various database usage scenarios without requiring detailed studies of the particular data demographics and workload.
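The placement decision can be sketched as follows. The abstract says only that the metric is a function of residency time and, optionally, block size; the product `size * residency_time` used here is an assumption, as are the names `Block`, `Cache`, and `place`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    block_id: str
    size: int               # size in arbitrary units
    residency_time: float   # estimated time the block would occupy the cache

    @property
    def metric(self) -> float:
        # Assumed metric: space-time footprint; lower is a better cache candidate.
        return self.size * self.residency_time


class Cache:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.blocks: List[Block] = []

    def used(self) -> int:
        return sum(b.size for b in self.blocks)

    def place(self, new: Block) -> str:
        """Return "cache" if the block is cached (flushing if needed), else "disk"."""
        free = self.capacity - self.used()
        victims: List[Block] = []
        # Consider cached blocks from the worst metric downward.
        for cached in sorted(self.blocks, key=lambda b: b.metric, reverse=True):
            if free >= new.size:
                break
            if cached.metric <= new.metric:
                break  # remaining cached blocks are at least as good as the new one
            victims.append(cached)
            free += cached.size
        if free < new.size:
            return "disk"  # not enough room without flushing better blocks
        for victim in victims:
            self.blocks.remove(victim)
        self.blocks.append(new)
        return "cache"


cache = Cache(capacity=100)
print(cache.place(Block("big", size=80, residency_time=10.0)))
print(cache.place(Block("small", size=30, residency_time=1.0)))
print(cache.place(Block("huge", size=90, residency_time=50.0)))
```

Under this assumed metric, a small block with a short residency time displaces a large, long-resident block, while a block with a worse metric than everything in the cache goes to disk.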
Abstract:
A system includes a multi-system database management system having a plurality of database systems. An index selection subsystem combines sets of query information from respective ones of the plurality of database systems into a workload. The index selection subsystem then generates candidate indexes from the workload, and selects recommended indexes from the candidate indexes based on one or more criteria.
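The three stages (combine, generate candidates, select) can be sketched as follows. The representation of query information as `(table, predicate columns)` pairs with frequencies, and the selection criteria (a minimum frequency and a limit), are assumptions made for illustration. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

QueryInfo = Tuple[str, Tuple[str, ...]]  # (table, columns used in predicates)
Index = Tuple[str, Tuple[str, ...]]      # (table, indexed columns)


def combine_workload(per_system_logs: List[Dict[QueryInfo, int]]) -> Counter:
    """Merge query-frequency logs from each database system into one workload."""
    workload: Counter = Counter()
    for log in per_system_logs:
        workload.update(log)  # adds frequencies for matching query shapes
    return workload


def generate_candidates(workload: Counter) -> Counter:
    """Propose single-column and composite indexes, weighted by query frequency."""
    candidates: Counter = Counter()
    for (table, columns), freq in workload.items():
        for col in columns:
            candidates[(table, (col,))] += freq
        if len(columns) > 1:
            candidates[(table, columns)] += freq
    return candidates


def recommend(candidates: Counter, min_freq: int, limit: int) -> List[Index]:
    """Assumed criteria: frequency threshold, then the top `limit` by frequency."""
    eligible = [(idx, freq) for idx, freq in candidates.items() if freq >= min_freq]
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return [idx for idx, _ in eligible[:limit]]


system_1 = {("orders", ("customer_id",)): 5, ("orders", ("customer_id", "order_date")): 2}
system_2 = {("orders", ("customer_id",)): 3, ("items", ("sku",)): 1}
workload = combine_workload([system_1, system_2])
print(recommend(generate_candidates(workload), min_freq=2, limit=2))
```

A production index advisor would typically weigh estimated query savings against index maintenance and storage cost; frequency alone is used here only to keep the sketch short.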
Abstract:
A computer-implemented method, apparatus, and article of manufacture optimize a database query. A query execution plan for the database query is generated using estimated cost information, and one or more steps of the query execution plan are executed to retrieve data from a database stored on the computer system. Actual cost information is generated for each of the executed steps, and the estimated cost information is re-calculated using the actual cost information. One or more resource allocation rules defined on one or more steps of the query execution plan are executed, based on the estimated cost information, wherein the resource allocation rules include one or more defined actions. The estimated cost information may be re-calculated using the actual cost information when confidence in the estimated cost information is low, but may be left unchanged when confidence in the estimated cost information is high. In addition, the estimated cost information may be re-calculated using the actual cost information only when the step has one or more resource allocation rules defined thereon.
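The feedback loop described above can be sketched as follows. The proportional correction (scaling an estimate by the actual-to-estimated ratio of the previously executed step), the confidence threshold, and the names `Rule`, `Step`, and `run_plan` are all assumptions; the abstract does not specify how estimates are re-calculated.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Rule:
    threshold: float  # rule fires when the estimated cost exceeds this value
    action: str       # defined action, e.g. "lower_priority" (illustrative)


@dataclass
class Step:
    name: str
    estimated_cost: float
    confidence: float  # 0.0 = no confidence in the estimate, 1.0 = full confidence
    rules: List[Rule] = field(default_factory=list)


def run_plan(steps: List[Step], execute: Callable[[Step], float],
             low_confidence: float = 0.5) -> List[Tuple[str, str]]:
    """Execute steps in order; return the (step name, action) pairs that fired."""
    triggered: List[Tuple[str, str]] = []
    correction = 1.0  # actual/estimated ratio observed on the previous step
    for step in steps:
        original = step.estimated_cost
        # Re-calculate only for low-confidence steps that carry rules.
        if step.rules and step.confidence < low_confidence:
            step.estimated_cost = original * correction
        for rule in step.rules:
            if step.estimated_cost > rule.threshold:
                triggered.append((step.name, rule.action))
        actual = execute(step)  # actual cost measured for the executed step
        if original > 0:
            correction = actual / original
    return triggered


actual_costs = {"scan": 400.0, "join": 800.0}
plan = [
    Step("scan", estimated_cost=100.0, confidence=0.9),
    Step("join", estimated_cost=200.0, confidence=0.3,
         rules=[Rule(threshold=500.0, action="lower_priority")]),
]
print(run_plan(plan, execute=lambda s: actual_costs[s.name]))
```

Because the scan costs four times its estimate, the low-confidence join estimate is scaled up and crosses the rule threshold; a high-confidence join estimate would be left as-is and the rule would not fire.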