摘要:
A method and system for Extensible Markup Language (XML) schema validation, includes: loading an XML document into a runtime validation engine, where the runtime validation engine includes an XML schema validation parser; loading an annotated automaton encoding (AAE) for an XML schema definition into the XML schema validation parser; and validating the XML document against the XML schema definition by the XML schema validation parser utilizing the annotated automaton encoding. Each XML schema definition is compiled once into the AAE format, rather than being compiled each time an XML document is validated, and thus significant time is saved. The code for the runtime validation engine is fixed and does not vary depending on the XML schema definition, rather than varying for each XML schema definition, and thus space overhead is minimized. Flexibility in the validation process is provided without compromising performance.
摘要:
A storage of nodes of hierarchically structured data uses logical node identifiers to reference the nodes stored within and across record data structures. A node identifier index is used to map each logical node identifier to a record identifier for the record that contains the node. When a sub-tree is stored in a separate record, a proxy node is used to represent the sub-tree in the parent record. The mapping in the node identifier index reflects the storage of the sub-tree nodes in the separate record. Since the references between the records are through logical node identifiers, there is no limitation to the moving of records across pages, as long as the indices are updated or rebuilt to maintain synchronization with the resulting data pages. This approach is highly scalable and has a much smaller storage consumption than approaches that use explicit references between nodes.
摘要:
A method and system for providing a scalable storage scheme for native hierarchically structured data of relational tables, includes a base table with indicator columns with information pertaining to hierarchically structured data of a document, data tables for storing the hierarchically structured data corresponding to the indicator columns, and node identifier indexes corresponding to the data tables for mapping between the indicator columns and the hierarchically structured data in the data tables. In an embodiment, actual data for each hierarchically structured data (such as XML) column is stored in a separate data table, and each data table has a separate node identifier index. The node identifier index is searched with a key containing the document identifier and a logical node identifier is used, and a record identifier of a record in the data table containing the node assigned the logical node identifier is retrieved.
摘要:
A method, apparatus, and article of manufacture for optimizing a query in a computer system. During compilation of the query, a GROUP BY clause with one or more GROUPING SETS, ROLLUP or CUBE operations is maintained in its original form until after query rewrite. The GROUP BY clause with the GROUPING SETS, ROLLUP or CUBE operations is then translated into a plurality of levels having one or more grouping sets. After compilation of the query, a grouping sets sequence is dynamically determined for the GROUP BY clause with the GROUPING SETS, ROLLUP or CUBE operations based on intermediate grouping sets, in order to optimize the grouping sets sequence. The execution of the grouping sets sequence is optimized by selecting a smallest grouping set from a previous one of the levels as an input to a grouping set on a next one of the levels. Finally, a UNION ALL operation is performed on the grouping sets.
摘要:
Disclosed is method for processing an aggregate function. Rows that contain a reference to intermediate result structures are grouped to form groups. For each group, aggregate element structures are formed from the intermediate result structures and, if the aggregate function specifies ordering, the aggregate element structures are sorted based on a sort key.
摘要:
An XML schema is compiled into an annotated automaton encoding, which includes a parsing table for structural information and annotation for type information. The representation is extended to include a mapping from schema types to states in a parsing table. To validate a fragment against a schema type, it is necessary simply to determine the state corresponding to the schema type, and start the validation process from that state. When the process returns to the state, fragment validation has reached successful completion. This approach is more efficient than a general tree representation. Only the data representation of the schema information is handled, making it much easier than manipulating validation parser code generated by a parser generator. In addition, only one representation is needed for schema information for both document and fragment validation. This approach also provides a basis for incremental validation after update.
摘要:
An XML schema is compiled into an annotated automaton encoding, which includes a parsing table for structural information and annotation for type information. The representation is extended to include a mapping from schema types to states in a parsing table. To validate a fragment against a schema type, it is necessary simply to determine the state corresponding to the schema type, and start the validation process from that state. When the process returns to the state, fragment validation has reached successful completion. This approach is more efficient than a general tree representation. Only the data representation of the schema information is handled, making it much easier than manipulating validation parser code generated by a parser generator. In addition, only one representation is needed for schema information for both document and fragment validation. This approach also provides a basis for incremental validation after update.
摘要:
An XML schema is compiled into an annotated automaton encoding, which includes a parsing table for structural information and annotation for type information. The representation is extended to include a mapping from schema types to states in a parsing table. To validate a fragment against a schema type, it is necessary simply to determine the state corresponding to the schema type, and start the validation process from that state. When the process returns to the state, fragment validation has reached successful completion. This approach is more efficient than a general tree representation. Only the data representation of the schema information is handled, making it much easier than manipulating validation parser code generated by a parser generator. In addition, only one representation is needed for schema information for both document and fragment validation. This approach also provides a basis for incremental validation after update.
摘要:
Type annotation record information storage for annotated automaton encoding for high-performance XML schema validation is optimized in a space efficient aspect. Subsequent to type annotation record information organization, type annotation records are used for type annotation of validated XML documents, either by implementing annotation records and type annotation part of an algorithm only, or by skipping one or more validation steps in a full validation implementation. Given a schema context, a type annotation may be performed for a validated XML fragment as opposed to an entire document. In addition, default features such as attribute and type are supported.
摘要:
A method and system for evaluating a path query are disclosed. The path query corresponds to a query tree including a plurality of query nodes. At least one query node corresponds to at least one predicate and is at a level. The predicate(s) are evaluated for previous query node(s). The method and system include scanning data nodes of a document and determining if the data nodes match the query nodes. The method and system also include placing data related to the data node in match stacks corresponding to matched query nodes. The data for the query node(s) include attribute(s) corresponding to the predicate(s). The method and system further include propagating a matching of the at least one query node backward to a matching of the at least one previous query node.