Abstract:
Resources are typically stored in homogeneous data structures by shredding resource data into database tables, which destroys the native format of the resources. Typical approaches to indexing the resources rely on users indicating which properties should be indexed, on full-text searches to create resource index documents, and on other labor- and computation-intensive processes. Functionality can be implemented to dynamically generate the resource index documents based on resource properties with minimal user input. The resource index documents can be in a common format to facilitate access to resources stored in heterogeneous native resource formats.
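As an illustration only, the idea of deriving index documents in a common format from resources kept in different native formats might be sketched as follows. The abstract does not specify any format or API, so the function names (`flatten`, `xml_properties`, `build_index_document`), the dotted property paths, and the layout of the index document are all assumptions made for this sketch.

```python
import json
import xml.etree.ElementTree as ET


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into dotted property paths."""
    props = {}
    if isinstance(value, dict):
        for key, item in value.items():
            props.update(flatten(item, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list):
        for i, item in enumerate(value):
            props.update(flatten(item, f"{prefix}[{i}]"))
    else:
        props[prefix] = value
    return props


def xml_properties(text):
    """Collect attributes and non-empty element text from an XML resource.

    Simplification: repeated element names overwrite earlier values.
    """
    props = {}
    for elem in ET.fromstring(text).iter():
        for name, val in elem.attrib.items():
            props[f"{elem.tag}@{name}"] = val
        if elem.text and elem.text.strip():
            props[elem.tag] = elem.text.strip()
    return props


def build_index_document(resource_id, native_format, payload):
    """Build an index document in one common (hypothetical) layout.

    The resource itself stays in its native format; only its
    properties are copied into the index document.
    """
    if native_format == "json":
        props = flatten(json.loads(payload))
    elif native_format == "xml":
        props = xml_properties(payload)
    else:
        raise ValueError(f"unsupported native format: {native_format}")
    return {
        "resource_id": resource_id,
        "native_format": native_format,
        "properties": props,
    }
```

In this sketch no user has to name the properties to index: every property found in the resource is carried into the index document, and both JSON and XML resources end up with the same top-level keys.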
Abstract:
An apparatus, system, and method are disclosed for efficient content indexing of streaming XML document content. A forest generator generates an XML pattern forest from a set of structured index path expressions. The XML pattern forest includes trees and twigs generated from structured index path expressions uniquely associated with a namespace indicator for an XML node. The XML node is identified in a stream of at least one XML document. A comparison module compares the XML node to nodes of trees and twigs of the XML pattern forest. A determination module determines a match between the XML node and an index node in one of a tree and a twig of the XML pattern forest. The index node has a path from an ancestor node to the index node that matches the axis steps of at least one of the structured index path expressions. A storage module stores an index entry for the XML node in response to the determined match. The index entry includes an XML document identifier, an XML node name, a namespace indicator for the XML node, and XML node content.
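The flow described above can be approximated with a much simplified sketch. It is not the disclosed apparatus: it handles child-axis steps only, merges the path expressions into a plain trie rather than distinguishing trees from twigs, and applies one namespace to every step of an expression. The names `PatternNode`, `build_forest`, and `index_stream` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


class PatternNode:
    """One node of the simplified pattern forest."""

    def __init__(self):
        self.children = {}          # (namespace, local name) -> PatternNode
        self.is_index_node = False  # True where a path expression ends


def build_forest(path_expressions):
    """Merge (namespace, '/a/b/c') expressions into a shared trie."""
    root = PatternNode()
    for namespace, path in path_expressions:
        node = root
        for step in (s for s in path.split("/") if s):
            node = node.children.setdefault((namespace, step), PatternNode())
        node.is_index_node = True
    return root


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into its two parts."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def index_stream(doc_id, stream, forest):
    """Match streamed XML nodes against the forest and collect index entries."""
    entries = []
    stack = [forest]  # pattern node matched at each open element, or None
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            parent = stack[-1]
            key = split_tag(elem.tag)
            stack.append(parent.children.get(key) if parent else None)
        else:
            matched = stack.pop()
            if matched is not None and matched.is_index_node:
                namespace, local = split_tag(elem.tag)
                content = (elem.text or "").strip()
                entries.append((doc_id, local, namespace, content))
            elem.clear()  # release element content once it has been handled
    return entries
```

Each entry carries the four fields the abstract lists: document identifier, node name, namespace indicator, and node content. Because the matcher keeps only a stack of pattern nodes for the currently open elements, it can decide whether to index a node without holding the whole document in memory.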