Abstract:
A method and apparatus are disclosed for accurately estimating the cost of a database query. The estimate covers the total computer resources used and the elapsed time to produce the first row and the last row of each operator involved in the query, and/or the total resources used and elapsed time to return the overall response to the query. The method and apparatus accurately account for the resources used and elapsed time associated with blocking operators, such as sorts and hash joins, which cannot produce a first row until they have completed their operations.
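The distinction between first-row and last-row cost can be illustrated with a minimal Python sketch. It is not the disclosed method: the `Cost` record, the `scan`, `streaming`, and `blocking` helpers, and all per-row constants are hypothetical, and the sketch assumes a fully pipelined streaming operator adds only one row's worth of latency. The point it shows is that a blocking operator's first row cannot appear before its child's last row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    first_row: float  # elapsed time until the operator emits its first row
    last_row: float   # elapsed time until the operator emits its last row
    resources: float  # total resource units consumed by the subtree


def scan(rows: int, per_row: float = 1.0) -> Cost:
    """Leaf operator: the first row is ready after one row's worth of work."""
    total = rows * per_row
    return Cost(first_row=per_row, last_row=total, resources=total)


def streaming(child: Cost, rows: int, per_row: float = 0.1) -> Cost:
    """Non-blocking operator (e.g. a filter): emits as soon as its child does.

    Assumption: the work is pipelined with the child, so it adds only one
    row's latency to elapsed time but still consumes rows * per_row resources.
    """
    return Cost(
        first_row=child.first_row + per_row,
        last_row=child.last_row + per_row,
        resources=child.resources + rows * per_row,
    )


def blocking(child: Cost, rows: int,
             build_per_row: float = 0.5, emit_per_row: float = 0.1) -> Cost:
    """Blocking operator (e.g. a sort): must consume all input before emitting."""
    build = rows * build_per_row
    emit = rows * emit_per_row
    return Cost(
        first_row=child.last_row + build + emit_per_row,
        last_row=child.last_row + build + emit,
        resources=child.resources + build + emit,
    )


if __name__ == "__main__":
    base = scan(100)
    print("filter:", streaming(base, 100))
    print("sort:  ", blocking(base, 100))
```

Under these assumed constants, the filter's first-row time stays close to the scan's first-row time, while the sort's first-row time exceeds the scan's last-row time.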
Abstract:
The invention provides a mechanism for using statistics, in connection with various database query cost modeling techniques, to more accurately estimate the number of rows and unique entry counts (UECs) that will be produced by relational operators and predicates in database systems. The ability to accurately estimate the number of rows and UECs returned by a relational operator and/or a predicate is fundamental to computing the cost of a query execution plan. This, in turn, drives the optimizer's ability to select the query plan best suited to the desired performance goal. According to the present invention, histogram statistics are synthesized bottom-up from the leaf nodes to the root node of a query tree. Given input statistics in the form of histograms for each operand of a relational operator or predicate, the present inventive method and apparatus merge the input statistics in a way that simulates the effects of the run-time operator on the actual data. The result is a predicted row count and UEC for each histogram interval, representing the data that each such operator or predicate in the query tree will actually produce. A database query optimizer may use these statistics to select and implement an optimal query plan.
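As a rough illustration of merging two input histograms into an output histogram, the Python sketch below estimates an equijoin interval by interval. It is not the claimed merge procedure. It assumes the two histograms already share identical interval boundaries, and it uses the common textbook estimate (rows = r1 * r2 / max(u1, u2), UEC = min(u1, u2)), which in turn assumes values are uniformly distributed within an interval and that the smaller value set is contained in the larger. The `Interval` type and the `join_histograms` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    lo: float    # lower bound of the interval
    hi: float    # upper bound of the interval
    rows: float  # estimated row count falling in the interval
    uec: float   # estimated unique entry count in the interval


def join_histograms(left: List[Interval], right: List[Interval]) -> List[Interval]:
    """Estimate the histogram of an equijoin column from two aligned inputs."""
    if len(left) != len(right):
        raise ValueError("histograms must have the same number of intervals")
    merged = []
    for a, b in zip(left, right):
        if (a.lo, a.hi) != (b.lo, b.hi):
            raise ValueError("interval boundaries must be aligned")
        if a.uec == 0 or b.uec == 0:
            # No values on one side, so nothing in this interval can match.
            merged.append(Interval(a.lo, a.hi, 0.0, 0.0))
            continue
        rows = a.rows * b.rows / max(a.uec, b.uec)
        merged.append(Interval(a.lo, a.hi, rows, min(a.uec, b.uec)))
    return merged


if __name__ == "__main__":
    orders = [Interval(0, 10, 100, 10), Interval(10, 20, 50, 5)]
    customers = [Interval(0, 10, 20, 20), Interval(10, 20, 0, 0)]
    for interval in join_histograms(orders, customers):
        print(interval)
```

Because the output has the same shape as the inputs, it can serve as the input histogram for the parent operator, which is what allows statistics to be propagated from the leaves toward the root.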
Abstract:
The present invention reduces compile time in a top-down, rule-based system by identifying the complexity of a query before applying a rule to an expression. If the complexity of the query is above a threshold, the present invention determines whether the rule should be applied based upon several factors, including the type of rule and the position of the node in the search space. Rules that need not be applied are pruned at random, at a determined rate that prevents both search-space explosion and the elimination of large contiguous portions of the search space. Pruned rules are not applied; all other rules are applied.
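The pruning decision can be sketched in Python as follows. This is a simplified illustration rather than the disclosed procedure: the `select_rules` function, the `(name, prunable)` rule representation, and the numeric complexity measure are all hypothetical, and the sketch models only the rule-type factor, not the position of the node in the search space. Each prunable rule is dropped by an independent random draw, so pruning is spread across the rule set at roughly the requested rate.

```python
import random
from typing import List, Tuple

Rule = Tuple[str, bool]  # (rule name, whether the rule may be pruned)


def select_rules(rules: List[Rule], complexity: float, threshold: float,
                 prune_rate: float, rng: random.Random) -> List[Rule]:
    """Return the rules to apply to an expression.

    Below the complexity threshold every rule is applied. Above it, each
    prunable rule is skipped with probability prune_rate; rules marked as
    not prunable are always kept.
    """
    if complexity <= threshold:
        return list(rules)
    kept = []
    for name, prunable in rules:
        if prunable and rng.random() < prune_rate:
            continue  # pruned: this rule is not applied
        kept.append((name, prunable))
    return kept


if __name__ == "__main__":
    rules = [
        ("join_commutativity", True),
        ("join_associativity", True),
        ("implement_hash_join", False),
    ]
    rng = random.Random(42)  # seeded only to make the demo repeatable
    print(select_rules(rules, complexity=5, threshold=10, prune_rate=0.5, rng=rng))
    print(select_rules(rules, complexity=50, threshold=10, prune_rate=0.5, rng=rng))
```

Passing the random generator in explicitly keeps the choice of seed under the caller's control, which makes the pruning behavior reproducible when comparing plans.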