摘要:
Techniques are provided for ensuring lexical fidelity when an XML document is stored in a binary format. Operations, on the XML data, that would cause the loss of lexical fidelity between the original XML document and the binary-encoded version of the XML document are not performed. Such operations include the removal of unnecessary whitespace characters, certain data type conversions, CRLF normalization, the “collapsing” of two-tag empty elements into a single tag empty element, and the replacing of entity references or numeric character references with another value. An XML schema, to which the XML document conforms, may indicate that the XML document is to be stored in a lexical fidelity mode. Additionally, or alternatively, the database statement that (when executed) causes the XML document to be stored in a binary format may so indicate.
摘要:
Techniques are described herein for efficient and scalable processing of complex sets of XML schemas. The techniques described herein provide for reducing duplication of schema elements in volatile memory by building an XML schema in-memory model that stores repeating schema elements in in-memory data structures that are separate from in-memory data structures that store the parent schema elements which logically include or otherwise refer to the repeating schema elements. The techniques described herein also provide for faster generation of an in-memory model of an XML schema by pre-loading, in data structures on persistent storage, of schema elements from dependent XML schemas that are referenced and/or incorporated by the XML schema. The techniques described herein also provide for efficient processing of inter-dependent XML schemas by tracking all unresolved schema elements from dependent XML schemas and freeing the portions of volatile memory, which are used to process schema elements from the dependent XML schemas, as soon as the dependent schema elements being processed are stored in data structures on persistent storage.
摘要:
Techniques are provided for ensuring lexical fidelity when an XML document is stored in a binary format. Operations, on the XML data, that would cause the loss of lexical fidelity between the original XML document and the binary-encoded version of the XML document are not performed. Such operations include the removal of unnecessary whitespace characters, certain data type conversions, CRLF normalization, the “collapsing” of two-tag empty elements into a single tag empty element, and the replacing of entity references or numeric character references with another value. An XML schema, to which the XML document conforms, may indicate that the XML document is to be stored in a lexical fidelity mode. Additionally, or alternatively, the database statement that (when executed) causes the XML document to be stored in a binary format may so indicate.
摘要:
A method and apparatus for validating XML documents in a streaming fashion is provided. A streaming validator validates an XML document by comparing the contents of the XML document to an XML schema. Tokens are generated for each element or attribute of the XML schema and for each element or attribute of the XML document using the same generator token function. The elements and attributes of the XML document and XML schema are compared using tokens rather than string comparisons to perform the validation more efficiently.
摘要:
Techniques are described herein for efficient and scalable processing of complex sets of XML schemas. The techniques described herein provide for reducing duplication of schema elements in volatile memory by building an XML schema in-memory model that stores repeating schema elements in in-memory data structures that are separate from in-memory data structures that store the parent schema elements which logically include or otherwise refer to the repeating schema elements. The techniques described herein also provide for faster generation of an in-memory model of an XML schema by pre-loading, in data structures on persistent storage, of schema elements from dependent XML schemas that are referenced and/or incorporated by the XML schema. The techniques described herein also provide for efficient processing of inter-dependent XML schemas by tracking all unresolved schema elements from dependent XML schemas and freeing the portions of volatile memory, which are used to process schema elements from the dependent XML schemas, as soon as the dependent schema elements being processed are stored in data structures on persistent storage.
摘要:
Various techniques are described hereafter for improving the efficiency of updating XML documents in a content repository, such as a database system. Specifically, techniques are described for updating an XML document by dynamically merging a stream of XML data from the document with update information. Techniques are also described for efficient validation of XML documents. Because of the manner of the updates, specifically because the XML data being updated is in the form of a stream, the database system validates only those portions of the stream of XML data that have been updated. In the alternative, the database system validates that portion of the XML data that is associated with the parent node of the portion of XML data that has been updated.
摘要:
Various techniques are described hereafter for improving the efficiency of updating XML documents in a content repository, such as a database system. Specifically, techniques are described for updating an XML document by dynamically merging a stream of XML data from the document with update information. Techniques are also described for efficient validation of XML documents. Because of the manner of the updates, specifically because the XML data being updated is in the form of a stream, the database system validates only those portions of the stream of XML data that have been updated. In the alternative, the database system validates that portion of the XML data that is associated with the parent node of the portion of XML data that has been updated.
摘要:
A database server exploits the power of compression and a form of storing relational data referred to as column-major format, to store XML documents in shredded form. The column values that are to be stored for shredded XML documents are separately analyzed for a XML document to determine whether to store a particular column in column-major format or row-major format, and what compression technique to use, if any.
摘要:
A database server determines, on an element-level of granularity, what form of VARRAY storage to map collections of elements defined by a XML schema. A collection element may be mapped to an in-line VARRAY or an out-of-line VARRAY. The determination may based on a variety of factors, including the database type mapped to the collection element, database limitations that limit the form storage for certain database types, and annotations (“mapping annotations”) embedded within that XML schema that specifying a database type for database representation of a collection element or a form of VARRAY storage.
摘要:
A method and apparatus for accelerating value-based lookups of XML documents in XQuery is provided. XML indices can help to optimize SQL queries of XML documents stored in object-relational databases. Certain SQL/XML functions such as XMLTABLE( ) use XQuery expressions to query XML documents. Previously, such queries could not use the XML index because the PATH table of the XML index was not defined for XQuery semantics. Techniques described herein extend the XML index for use with queries that require evaluation of XQuery expressions. Consequently, techniques described herein accelerate value-based lookups of XML documents in XQuery by introducing the possibility of an index-assisted evaluation of XQuery expressions.