摘要:
Techniques for estimating a cost of executing a query are provided. A query includes multiple predicates, each of which is associated with a selectivity value that indicates a percentage of input that satisfies the condition of the corresponding predicate. The selectivity values are used to determine an estimated cost of executing the query. In one technique, a group of multiple predicates of a query are treated as a single predicate. Thus, a single selectivity value, rather than multiple selectivity values, is determined for that group. In a related technique, instead of determining a selectivity value of a predicate in isolation with respect to other predicates of a query, the selectivity value of a set of one or more predicates in a query is generated based on other predicates in the query.
摘要:
Techniques for estimating a cost of executing a query are provided. A query includes multiple predicates, each of which is associated with a selectivity value that indicates a percentage of input that satisfies the condition of the corresponding predicate. The selectivity values are used to determine an estimated cost of executing the query. In one technique, a group of multiple predicates of a query are treated as a single predicate. Thus, a single selectivity value, rather than multiple selectivity values, is determined for that group. In a related technique, instead of determining a selectivity value of a predicate in isolation with respect to other predicates of a query, the selectivity value of a set of one or more predicates in a query is generated based on other predicates in the query.
摘要:
A query is rewritten to combine streaming evaluation and XML index evaluation. The query is rewritten to include a streaming operator (e.g. function) that, when executed, uses streaming evaluation. Further, the query is rewritten so that XML index evaluation of a path expression also produces location information that identifies the location of a node within an XML document. The streaming operator is able to exploit the location information to begin and end scanning rather than scanning the entire XML document.
摘要:
A query is rewritten to combine streaming evaluation and XML index evaluation. The query is rewritten to include a streaming operator (e.g. function) that, when executed, uses streaming evaluation. Further, the query is rewritten so that XML index evaluation of a path expression also produces location information that identifies the location of a node within an XML document. The streaming operator is able to exploit the location information to begin and end scanning rather than scanning the entire XML document.
摘要:
Techniques for processing a query that includes a path expression are provided. A query processor determines whether an XML index may be used to execute the query instead of having to scan multiple XML documents. The query is parsed and normalized, which results in multiple normalized path expressions that are based on the original path expression. If the XML index is a path-subsetted index, then the query processor generates annotated path expressions based on the normalized path expressions. The query processor determines whether each of the annotated path expressions is satisfiable by the path-subsetted XML index. If so, then the XML index is used to process the query.
摘要:
The approaches described herein provide an efficient way for a database server to process certain kinds of queries over XML data stored in an object-relational database that require the evaluation of a predicate expression with one or more path-based operands. A predicate expression part of a XQuery or SQL WHERE clause that returns a boolean value. A database server first determines whether the query qualifies for this particular kind of optimization, then rewrites the query using an enhanced query operator syntax for specifying the predicate expression to be evaluated. The enhanced query operator subsumes the work of a second path-based query operator, resulting in the suppression of the WHERE EXISTS subquery. The rewritten query operator is used to generate a query execution plan that provides for several query execution optimizations.
摘要:
A database system may perform a streaming evaluation of an XPath expression by utilizing an XPath evaluation component in tandem with an XML event-streaming component. For a more optimal filtered streaming evaluation, the XML event-streaming component may provide an interface whereby the evaluation component sends certain criteria to the event-streaming component when requesting an XML event. The criteria may be based on a next unmatched step in the XPath expression. In response to the request for an XML event, the event-streaming component may only return events that match the criteria. The evaluation component may be, for example, a compiled state machine for the XPath expression. The criteria may be pre-compiled for each possible state in the state machine. The event-streaming component may also utilize the criteria along with schema information to skip parsing of certain segments of XML data.
摘要:
The approaches described herein provide an efficient way for a database server to process certain kinds of queries over XML data stored in an object-relational database that require the evaluation of a predicate expression with one or more path-based operands. A predicate expression part of a XQuery or SQL WHERE clause that returns a boolean value. A database server first determines whether the query qualifies for this particular kind of optimization, then rewrites the query using an enhanced query operator syntax for specifying the predicate expression to be evaluated. The enhanced query operator subsumes the work of a second path-based query operator, resulting in the suppression of the WHERE EXISTS subquery. The rewritten query operator is used to generate a query execution plan that provides for several query execution optimizations.
摘要:
Techniques are provided for estimating the cardinality of a virtual result table that is produced by executing path-based table functions within a query, such as the XMLTABLE function. Some path-based table functions apply a path expression to input from a base table of XML documents to select rows to produce the result table. Path statistics are collected for the path expressions for the base table. The path statistics are used to estimate the cardinalities of the result table. The estimated cardinality of the result table is useful for estimating costs of query execution plans that are generated for the query.
摘要:
Techniques for estimating the cost of processing a database statement that includes one or more path expressions are provided. One aspect of cost is I/O cost, or the cost of reading data from persistent storage into memory according to a particular streaming operator. Binary-encoded XML data is stored in association with a synopsis that summarizes the binary-encoded XML data. The synopsis includes skip length information for one or more elements and indicates, for each such element, how large (e.g., in bytes) the element is in storage. The skip length information of a particular element thus indicates how much data may be skipped during I/O if the particular element does not match the path expression that is input to the streaming operator. The skip length information of one or more elements is used to estimate the cost of processing the database statement.