Abstract:
Described herein is a system for “lazy” manifestation of XML documents. In lazy manifestation, only portions of an XML document that contain data of interest (e.g., a particular element or attribute requested by an application) are manifested. The term “manifesting a portion of an XML document” refers to creating an in-memory representation of the portion and incorporating it into the existing in-memory representation of the XML document, if any. Each such portion is referred to herein as a loadable unit. A loadable unit is a set of one or more nodes in an XML document such that, when any node in the set needs to be manifested, the other nodes in the loadable unit are manifested too. Loadable units may, but need not, correspond to the content structures that store the nodes. For example, a loadable unit may be the set of nodes whose content is contained in a single row.
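The behavior described above can be illustrated with a minimal sketch. It assumes that each loadable unit corresponds to one stored row and that nodes are identified by path strings; the class `LazyDocument` and its fields are hypothetical names introduced only for illustration and are not taken from the abstract.

```python
class LazyDocument:
    """Manifests loadable units on demand (hypothetical illustration)."""

    def __init__(self, rows):
        # rows maps a unit id to the nodes stored in that row: {node_path: value}
        self._rows = rows
        self._node_to_unit = {
            node: unit_id for unit_id, nodes in rows.items() for node in nodes
        }
        self.manifested = {}       # in-memory representation built so far
        self.loaded_units = set()  # ids of loadable units already manifested

    def get(self, node):
        unit_id = self._node_to_unit[node]
        if unit_id not in self.loaded_units:
            # Manifest the whole loadable unit, not just the requested node,
            # and merge it into the existing in-memory representation.
            self.manifested.update(self._rows[unit_id])
            self.loaded_units.add(unit_id)
        return self.manifested[node]


doc = LazyDocument({
    "row1": {"/order/id": "42", "/order/date": "2024-01-01"},
    "row2": {"/order/customer/name": "Ada"},
})
value = doc.get("/order/id")
```

Requesting `/order/id` also manifests `/order/date`, because both belong to the same loadable unit, while the node stored in `row2` stays unloaded until something asks for it.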
Abstract:
A method and system are provided for maintaining an XML index in response to piece-wise modifications on indexed XML documents. The database server that manages the XML index determines which nodes are involved in the piece-wise modifications, and updates the XML index based on only those nodes. Index entries for nodes not involved in the piece-wise modifications remain unchanged.
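As a rough sketch of this idea, the index can be modeled as a mapping from (document, path) pairs to values, with a piece-wise modification expressed as the set of paths it touches. The function name `apply_piecewise_update` and the dictionary layout are assumptions made for illustration, not the structure used by any particular database server.

```python
def apply_piecewise_update(index, doc_id, changed):
    """Update only the index entries for nodes involved in a modification.

    index   : dict mapping (doc_id, path) -> indexed value (hypothetical layout)
    changed : dict mapping path -> new value, or None if the node was deleted
    Returns the number of index entries that were touched.
    """
    touched = 0
    for path, new_value in changed.items():
        key = (doc_id, path)
        if new_value is None:
            index.pop(key, None)    # node deleted: drop its entry
        else:
            index[key] = new_value  # node inserted or updated
        touched += 1
    return touched


index = {
    ("doc1", "/order/id"): "42",
    ("doc1", "/order/date"): "2024-01-01",
    ("doc1", "/order/note"): "fragile",
}
count = apply_piecewise_update(
    index, "doc1", {"/order/date": "2024-02-01", "/order/note": None}
)
```

The entry for `/order/id` is never read or rewritten, which is the point of maintaining the index per modified node rather than re-indexing the whole document.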
Abstract:
A method, mechanism, and computer program product for storing, accessing, and managing XML data are disclosed. The approach supports efficient evaluation of XPath queries and also improves the performance of data/fragment extraction. The approach can be applied to schema-less documents. The approach is applicable to all database systems and other servers that support storing and managing XML content. In addition, the approach can be applied to store, manage, and retrieve other types of unstructured or semi-structured data in a database system.
Abstract:
An XML document can be represented in a compact binary form that maintains all of the features of XML data in a useable form. In response to a request for a modification (e.g., insert, delete or update a node) to an XML document that is stored in the compact binary form, a certain representation of the requested modification is computed for application directly to the binary form of the document. Thus, the requested modification is applied directly to the persistently stored binary form without constructing an object tree or materializing the XML document into a corresponding textual form. Taking into account the nature of the binary form in which the document is encoded, the bytes that actually require change are identified, including identifying where in the binary representation the corresponding actual changes need to be made.
Abstract:
A method and system are provided for extracting a valid, self-contained fragment for a node in an XML document stored in a database management system. An XML index is used to identify a location in which XML fragment data corresponding to the node is located. Ancestors of the node are identified and examined for any information needed for the proper interpretation of the fragment. If an ancestor node contains such needed information, this information is patched into the XML fragment to ensure that the fragment is a valid, self-contained XML fragment.
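One kind of information a fragment may need from its ancestors is a namespace declaration. The sketch below assumes that this is the information being patched, and that the stored fragment and each node's declarations are already available in memory; the `Node` class and `extract_fragment` function are hypothetical, and the index lookup that locates the fragment is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str                                     # tag name, possibly prefixed
    fragment: str                                 # serialized XML as stored
    ns_decls: dict = field(default_factory=dict)  # prefix -> URI declared here
    parent: Optional["Node"] = None


def extract_fragment(node):
    """Return the node's fragment with inherited namespace declarations patched in."""
    needed = {}
    ancestor = node.parent
    while ancestor is not None:
        for prefix, uri in ancestor.ns_decls.items():
            # The node's own declarations override inherited ones, and the
            # nearest ancestor wins over more distant ones.
            if prefix not in node.ns_decls:
                needed.setdefault(prefix, uri)
        ancestor = ancestor.parent
    # Simplification: every inherited declaration is patched in, whether or
    # not the fragment actually uses the prefix.
    attrs = "".join(
        f' xmlns:{p}="{u}"' if p else f' xmlns="{u}"'
        for p, u in sorted(needed.items())
    )
    start = "<" + node.name
    return node.fragment.replace(start, start + attrs, 1)


root = Node("po:order", "<po:order>...</po:order>", {"po": "urn:example:po"})
item = Node("po:item", "<po:item>pen</po:item>", parent=root)
patched = extract_fragment(item)
```

Without the patched declaration, `<po:item>pen</po:item>` would not parse on its own because the `po` prefix is unbound outside the original document.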
Abstract:
A method and apparatus for streaming validation of XML documents is provided. A particular event of a series of events is received. The series of events is generated as an XML document is parsed by a parser, and the received particular event indicates that the parser has encountered a particular part of the XML document. The particular part of the XML document indicated by the particular event is then received. A current validation state for the XML document is determined. The current validation state, which is one of a plurality of validation states for the XML document, indicates a validation type associated with the particular part of the XML document. Based on at least the current validation state, the particular part of the XML document is validated against an XML schema that defines the structure of the XML document.
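A heavily simplified sketch of event-driven validation follows. It uses Python's `xml.etree.ElementTree.iterparse` as the event source and a toy "schema" that only lists which child elements each element may contain; the validation state here is just the stack of open elements. The abstract's notion of validation states and validation types is richer than this, so `SCHEMA` and `validate_stream` should be read as illustrative assumptions only.

```python
import io
import xml.etree.ElementTree as ET

# Toy schema (assumption): parent element name -> set of allowed child names.
# The key None stands for the document level and lists the allowed root.
SCHEMA = {None: {"order"}, "order": {"id", "item"}, "id": set(), "item": set()}


def validate_stream(xml_text, schema=SCHEMA):
    """Validate element nesting event by event, without building a full tree first."""
    stack = [None]  # current validation state: the chain of open elements
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Validate this part of the document against the current state.
            if elem.tag not in schema.get(stack[-1], set()):
                return False
            stack.append(elem.tag)
        else:
            stack.pop()
    return True


ok = validate_stream("<order><id>1</id><item>pen</item></order>")
bad = validate_stream("<order><price>3</price></order>")
```

The second document is rejected as soon as the `start` event for `price` arrives, before the rest of the input is examined.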
Abstract:
A method and system are provided for determining whether a given path is an indexed path of XML documents stored in a database management system. A finite state machine is built using the path subsetting rules specified by a user. The finite state machine is traversed using the given path. If any accepting state is reached during the traversal of the finite state machine, the given path is determined to match the path subsetting rules.
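The following sketch shows one way such a machine could be built and traversed. It assumes rules are simple slash-separated paths in which `*` matches any single step, and it treats a path as matching only when the traversal ends in an accepting state; both are simplifying assumptions, and `build_fsm` and `is_indexed` are hypothetical names.

```python
def build_fsm(rules):
    """Build a trie-shaped state machine from path subsetting rules."""
    transitions = [{}]  # transitions[state][step] -> next state; state 0 is the start
    accepting = set()
    for rule in rules:
        state = 0
        for step in rule.strip("/").split("/"):
            if step not in transitions[state]:
                transitions.append({})
                transitions[state][step] = len(transitions) - 1
            state = transitions[state][step]
        accepting.add(state)
    return transitions, accepting


def is_indexed(path, fsm):
    """Traverse the machine with the given path and report whether it matches a rule."""
    transitions, accepting = fsm
    current = {0}
    for step in path.strip("/").split("/"):
        following = set()
        for state in current:
            # A step can follow an exact-name edge or a wildcard edge.
            for label in (step, "*"):
                if label in transitions[state]:
                    following.add(transitions[state][label])
        current = following
        if not current:
            return False
    return bool(current & accepting)


fsm = build_fsm(["/order/id", "/order/*/name"])
results = [
    is_indexed("/order/id", fsm),
    is_indexed("/order/customer/name", fsm),
    is_indexed("/order/date", fsm),
]
```

Tracking a set of current states lets an exact-name rule and a wildcard rule be followed at the same time, so overlapping rules do not interfere with each other.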
Abstract:
Techniques are provided for indexing XML documents using path subsetting. According to one embodiment, a PATH table is created for storing one row for each indexed node of the XML documents. User-defined criteria are used to determine which nodes of the XML documents to include in the PATH table. The PATH table row for a node includes (1) information for locating the XML document that contains the node, (2) information that identifies the path of the node, and (3) information that identifies the position of the node within the hierarchical structure of the XML document that contains the node. Use of the user-defined criteria is transparent to queries and reduces the index maintenance overhead incurred by DML operations.
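To make the three pieces of row information concrete, the sketch below builds PATH-table-like rows as `(document id, path, position)` tuples. The position is encoded as a dotted order key such as `1.3.1`, and the user-defined criteria are modeled as a predicate over paths; both choices, along with the name `path_table_rows`, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def path_table_rows(doc_id, xml_text, include):
    """Return one (doc_id, path, order_key) row per element accepted by `include`."""
    rows = []

    def walk(elem, path, order_key):
        if include(path):  # user-defined criteria decide what gets indexed
            rows.append((doc_id, path, order_key))
        for i, child in enumerate(elem, start=1):
            # The order key records the node's position in the hierarchy.
            walk(child, f"{path}/{child.tag}", f"{order_key}.{i}")

    root = ET.fromstring(xml_text)
    walk(root, "/" + root.tag, "1")
    return rows


rows = path_table_rows(
    "doc1",
    "<order><id>1</id><note>x</note><item><name>pen</name></item></order>",
    include=lambda path: path != "/order/note",
)
```

Excluded nodes still consume an order key (here `1.2`), so the positions of indexed nodes continue to reflect the real document structure.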
Abstract:
Various techniques are provided for facilitating the management of hierarchical data within a relational database system. One such technique involves separating the storage structures used to store data that captures the information about the hierarchy (the “hierarchy structures”), from the storage structures used to store the content of the resources that belong to the hierarchy (the “content structures”). Techniques are also provided for allowing users to customize the metadata attributes associated with resources that belong to the information hierarchy. One technique involves registering XML schemas that specify the metadata attributes desired by a user. Another technique involves storing attributes that do not correspond to any declared field in a “catch-all” column within the resource table. Techniques are provided for determining how to store resources as they are added to the database. According to one technique, the database server searches the data of the resource to find content-type information. If content-type information is found, then the database server consults a content-type to content-structure mapping to determine where to store the content of the resource.