摘要:
A method and system for the in-place evolution of XML schemas is disclosed. To automatically evolve an existing XML schema, a schema evolver receives both an existing XML schema and an XML document as input. The XML document indicates changes to be made to the existing XML schema. Based on the existing XML schema and the XML document, the schema evolver evolves the existing XML schema into a new XML schema that incorporates the changes indicated in the XML document. According to one aspect, the schema evolver generates one or more SQL statements based on the new XML schema. The SQL statements, when executed by a database server, cause the database server to evolve database structures that were based on the formerly existing XML schema so that the database structures conform to the new XML schema. This is accomplished “in place,” without copying the data in the database structures.
摘要:
Techniques are described herein for efficient and scalable processing of complex sets of XML schemas. The techniques described herein provide for reducing duplication of schema elements in volatile memory by building an XML schema in-memory model that stores repeating schema elements in in-memory data structures that are separate from in-memory data structures that store the parent schema elements which logically include or otherwise refer to the repeating schema elements. The techniques described herein also provide for faster generation of an in-memory model of an XML schema by pre-loading, in data structures on persistent storage, of schema elements from dependent XML schemas that are referenced and/or incorporated by the XML schema. The techniques described herein also provide for efficient processing of inter-dependent XML schemas by tracking all unresolved schema elements from dependent XML schemas and freeing the portions of volatile memory, which are used to process schema elements from the dependent XML schemas, as soon as the dependent schema elements being processed are stored in data structures on persistent storage.
摘要:
A modular repository is described, where operational features may be implemented without the need to scan every resource included in the modular repository. A modular repository includes a dedicated set of database objects containing all information needed to access the resources in the repository. For example, the database objects of a modular repository may include those user identifier mappings and ACL mappings, etc., to which metadata in the modular repository refers. A database system may also include a mechanism through which a modular repository may be mounted under a subdirectory of a common directory in the database system. The resources of a modular repository that are mounted under the common directory may be accessed through the common directory. Further, a client may query the resources of any modular repository mounted under the common directory by making the federated repository, represented by the common directory, the context of the query.
摘要:
Techniques for implementing fast loading of binary XML into a binary XML database repository are provided. A client application reduces the processing burden on the repository by doing pre-processing of the binary XML data prior to loading.
摘要:
A method and apparatus for estimating the cost of streaming evaluation of XPaths is provided. Aggregate statistics are maintained by the database server upon initiation of a database function by the database administrator about the nodes of the XML document. Based upon these statistics and the complexity of the particular XPath query, an estimate of the cost of the query, in time and computing resources required, is computed.
摘要:
The approaches described herein provide an efficient way for a database server to process certain kinds of queries over XML data stored in an object-relational database that require the evaluation of a predicate expression with one or more path-based operands. A predicate expression part of a XQuery or SQL WHERE clause that returns a boolean value. A database server first determines whether the query qualifies for this particular kind of optimization, then rewrites the query using an enhanced query operator syntax for specifying the predicate expression to be evaluated. The enhanced query operator subsumes the work of a second path-based query operator, resulting in the suppression of the WHERE EXISTS subquery. The rewritten query operator is used to generate a query execution plan that provides for several query execution optimizations.
摘要:
Techniques for estimating the cost of processing a database statement that includes one or more path expressions are provided. One aspect of cost is I/O cost, or the cost of reading data from persistent storage into memory according to a particular streaming operator. Binary-encoded XML data is stored in association with a synopsis that summarizes the binary-encoded XML data. The synopsis includes skip length information for one or more elements and indicates, for each such element, how large (e.g., in bytes) the element is in storage. The skip length information of a particular element thus indicates how much data may be skipped during I/O if the particular element does not match the path expression that is input to the streaming operator. The skip length information of one or more elements is used to estimate the cost of processing the database statement.
摘要:
The approaches described herein provide an efficient way for a database server to process certain kinds of queries over XML data stored in an object-relational database that require the evaluation of a predicate expression with one or more path-based operands. A predicate expression part of a XQuery or SQL WHERE clause that returns a boolean value. A database server first determines whether the query qualifies for this particular kind of optimization, then rewrites the query using an enhanced query operator syntax for specifying the predicate expression to be evaluated. The enhanced query operator subsumes the work of a second path-based query operator, resulting in the suppression of the WHERE EXISTS subquery. The rewritten query operator is used to generate a query execution plan that provides for several query execution optimizations.
摘要:
Techniques for estimating the cost of processing a database statement that includes one or more path expressions are provided. One aspect of cost is I/O cost, or the cost of reading data from persistent storage into memory according to a particular streaming operator. Binary-encoded XML data is stored in association with a synopsis that summarizes the binary-encoded XML data. The synopsis includes skip length information for one or more elements and indicates, for each such element, how large (e.g., in bytes) the element is in storage. The skip length information of a particular element thus indicates how much data may be skipped during I/O if the particular element does not match the path expression that is input to the streaming operator. The skip length information of one or more elements is used to estimate the cost of processing the database statement.
摘要:
A method for providing optimized data representation of relations for in-memory database query processing is disclosed. The method seeks to optimize the use of the available memory by encoding relations on which the in-memory database query processing is performed and by employing auxiliary structures to maintain performance. Relations are encoded based on data patterns in one or more attribute-columns of the relation and the encoding that is selected is suited to a particular type of data in the column. Members of a set of auxiliary structures are selected based on the benefit the structure can provide and the cost of the structure in terms of the amount of memory used. Encoding of the relations is performed in real-time while query processing occurs, using locks to eliminate conflicts between the query processing and encoding.