摘要:
Techniques are provided for estimating the cardinality of a virtual result table that is produced by executing path-based table functions within a query, such as the XMLTABLE function. Some path-based table functions apply a path expression to input from a base table of XML documents to select rows to produce the result table. Path statistics are collected for the path expressions for the base table. The path statistics are used to estimate the cardinalities of the result table. The estimated cardinality of the result table is useful for estimating costs of query execution plans that are generated for the query.
摘要:
Techniques are provided for efficiently extracting scalar values from binary-encoded XML data. Node information is stored in association with binary-encoded XML data to indicate whether one or more nodes of an XML document are simple or complex. A node is simple if the node has no child elements and no attributes. The node information of a particular node is used to determine whether a particular node, identified in a query, is simple or complex. If the particular node is simple, then the scalar value of the particular node is identified without performing any operations other than possibly converting the scalar value to a non-binary-encoded format or converting the scalar value to a value of a different data type.
摘要:
A database system may utilize XML schema information to increase the efficiency of an XPath streaming evaluation. The database system may access XML schema or translation information during the evaluation of an element, attribute, or value in an XML data source. Based on the XML schema or translation information, the database system may determine matches to an XPath expression without decoding any binary-encoded data in the XML data source. Also, based on the XML schema information, the database may selectively skip or evaluate portions of the XML data source depending on whether those portions are defined in the XML schema so as to possibly contain a match to one or more unmatched steps in the XPath expression. XML schema information may be compiled into a compiled representation of the XPath expression for additional efficiencies.
摘要:
Techniques are provided for ensuring lexical fidelity when an XML document is stored in a binary format. Operations, on the XML data, that would cause the loss of lexical fidelity between the original XML document and the binary-encoded version of the XML document are not performed. Such operations include the removal of unnecessary whitespace characters, certain data type conversions, CRLF normalization, the “collapsing” of two-tag empty elements into a single tag empty element, and the replacing of entity references or numeric character references with another value. An XML schema, to which the XML document conforms, may indicate that the XML document is to be stored in a lexical fidelity mode. Additionally, or alternatively, the database statement that (when executed) causes the XML document to be stored in a binary format may so indicate.
摘要:
Techniques for implementing fast loading of binary XML into a binary XML database repository are provided. A client application reduces the processing burden on the repository by doing pre-processing of the binary XML data prior to loading.
摘要:
A method and apparatus for validating XML documents in a streaming fashion is provided. A streaming validator validates an XML document by comparing the contents of the XML document to an XML schema. Tokens are generated for each element or attribute of the XML schema and for each element or attribute of the XML document using the same generator token function. The elements and attributes of the XML document and XML schema are compared using tokens rather than string comparisons to perform the validation more efficiently.
摘要:
A method and system for evolving XML-schema-based data to conform to an evolved XML schema is disclosed. Based on an existing XML schema and an instance document that is based on the existing XML schema, an XML-schema-independent form of the instance document is generated. Based on a set of specified transformations and the XML-schema-independent form of the instance document, an evolved instance document is generated. The evolved instance document conforms to an evolved XML schema that incorporates changes to the existing XML schema. Techniques described herein are flexible enough to accommodate a wide variety of evolutions to XML schemas.
摘要:
One may increase the efficiency of an XML event-generating process by reducing the number of requests to allocate or deallocate system memory. Such reduction may occur as the result of pre-allocating a memory chunk of sufficient size to contain all of the memory buffers required by a particular event-generating process. Instead of allocating new memory chunks for new memory buffers, an application may store any required buffers within the pre-allocated memory chunk. A sufficient memory size may be estimated by performing the event-generating process on a training set of XML documents. Also, an application may re-use buffers during the process or between different iterations of the process, thus avoiding the need to deallocate and reallocate memory that is essentially being used for the same purpose.
摘要:
Techniques are provided for efficiently extracting scalar values from binary-encoded XML data. Node information is stored in association with binary-encoded XML data to indicate whether one or more nodes of an XML document are simple or complex. A node is simple if the node has no child elements and no attributes. The node information of a particular node is used to determine whether a particular node, identified in a query, is simple or complex. If the particular node is simple, then the scalar value of the particular node is identified without performing any operations other than possibly converting the scalar value to a non-binary-encoded format or converting the scalar value to a value of a different data type.
摘要:
A method and system for the in-place evolution of XML schemas is disclosed. To automatically evolve an existing XML schema, a schema evolver receives both an existing XML schema and an XML document as input. The XML document indicates changes to be made to the existing XML schema. Based on the existing XML schema and the XML document, the schema evolver evolves the existing XML schema into a new XML schema that incorporates the changes indicated in the XML document. According to one aspect, the schema evolver generates one or more SQL statements based on the new XML schema. The SQL statements, when executed by a database server, cause the database server to evolve database structures that were based on the formerly existing XML schema so that the database structures conform to the new XML schema. This is accomplished “in place,” without copying the data in the database structures.