Abstract:
In one embodiment, a method of parsing an XML data stream comprises receiving an XML data stream containing a namespace prefix and an associated element tag name. The element tag name is associated with an element tag. The namespace prefix and the element tag name are converted into a token that uniquely represents both the element tag and the namespace specification associated with the namespace prefix. A stack is defined and is configured to receive one or more tokens during parsing of the XML data stream. The XML data stream is parsed without building an XML tree structure for the XML document that the stream embodies.
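The token-and-stack idea can be illustrated with a minimal sketch. It assumes Python's standard `xml.sax` module as the streaming parser and small integers as tokens; the class name `TokenStackHandler` and the function `parse_tokens` are hypothetical names chosen for illustration, not anything named in the abstract.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class TokenStackHandler(ContentHandler):
    """Maps each (namespace URI, tag name) pair to an integer token and
    tracks the open elements on a stack. No document tree is built."""

    def __init__(self):
        super().__init__()
        self.tokens = {}     # (namespace URI, local name) -> token
        self.stack = []      # tokens of the currently open elements
        self.max_depth = 0   # deepest nesting seen so far

    def token_for(self, uri, local_name):
        # Assign the next unused integer the first time a pair is seen.
        key = (uri, local_name)
        if key not in self.tokens:
            self.tokens[key] = len(self.tokens) + 1
        return self.tokens[key]

    def startElementNS(self, name, qname, attrs):
        uri, local_name = name  # the parser has already resolved the prefix
        self.stack.append(self.token_for(uri, local_name))
        self.max_depth = max(self.max_depth, len(self.stack))

    def endElementNS(self, name, qname):
        self.stack.pop()


def parse_tokens(xml_bytes):
    """Stream-parse xml_bytes and return the handler with its token table."""
    handler = TokenStackHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler
```

Because the token is keyed on the namespace URI rather than the prefix, two elements with the same tag name but different namespaces receive different tokens, and memory use grows with nesting depth rather than document size.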
Abstract:
Various features enable an XML data stream to be parsed without the need to build a hierarchical tree structure for the XML document. In the described embodiment, an element or namespace stack is used to organize parsing activities and to maintain a definable place within the structure of the XML document. Various structures work together with the element or namespace stack to facilitate piecewise parsing of the XML data stream. One structure is a namespace hierarchy, a collection of namespace objects that each represent a namespace specification encountered in the XML data stream. Each object includes a namespace prefix and an associated namespace specification. This structure creates a hierarchical organization that is used for mapping a particular encountered namespace specification into a unique value that represents both the namespace specification and an element tag in which the namespace specification occurs. Another structure is a dictionary collection that contains one or more dictionaries. Each dictionary is specifically associated with a namespace specification that is encountered in the XML data stream. The dictionaries contain entries for one or more tag names and each name's associated unique token. The token is returned and placed on the element stack along with another special value that enables the proper state to be maintained during processing of the XML data stream. The stack also includes a text accumulation buffer that can hold any text that is contained within an element (between the element tags). When an XML element is encountered, the element stack is used to organize parsing activities as the parser makes its way through the XML data stream.
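The dictionary collection and the text accumulation buffer might be sketched as follows. This assumes Python's standard `xml.parsers.expat` module with namespace processing enabled; the `DICTIONARIES` table, the `urn:example:mail` namespace, the token values, and the `Frame` class are all hypothetical, and the additional "special value" that the abstract places on the stack is left out because the abstract does not say what it is.

```python
import xml.parsers.expat

# Hypothetical dictionary collection: one dictionary per namespace
# specification, each mapping tag names to tokens.
DICTIONARIES = {
    "urn:example:mail": {"message": 1, "subject": 2, "body": 3},
}
UNKNOWN = 0  # token used here for tags that no dictionary recognizes


class Frame:
    """One element-stack entry: a token plus a text accumulation buffer."""

    def __init__(self, token):
        self.token = token
        self.text = []


def collect_text(xml_bytes):
    """Return (token, text) for each element, in the order elements close."""
    stack = []
    results = []

    def start(name, attrs):
        # With namespace_separator=" ", expat reports "URI localname".
        uri, _, local = name.rpartition(" ")
        token = DICTIONARIES.get(uri, {}).get(local, UNKNOWN)
        stack.append(Frame(token))

    def chars(data):
        if stack:
            stack[-1].text.append(data)  # text belongs to the innermost element

    def end(name):
        frame = stack.pop()
        results.append((frame.token, "".join(frame.text).strip()))

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return results
```

Keeping the buffer on the stack frame means text is attributed to the element that directly contains it, and the buffer is released as soon as that element closes.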
Abstract:
Systems for parsing an XML data stream are described. In one embodiment, the system is configured to receive an XML data stream comprising one or more element tags and determine whether an element tag contains a namespace declaration. The system creates one or more namespace objects if an element tag contains one or more respective namespace declarations, each namespace object corresponding to one namespace declaration. The system associates namespace objects with one another if more than one namespace object is created and associates each namespace object with a dictionary that contains one or more entries that are associated with an element tag.
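One way to picture the linked namespace objects is the sketch below. It assumes `xml.parsers.expat` with namespace processing turned off, so that `xmlns` declarations arrive as ordinary attributes and can be detected on each element tag; `NamespaceObject`, `resolve_tags`, and the shape of the `dictionaries` argument are hypothetical. Undeclaring a namespace with `xmlns=""` is not handled.

```python
import xml.parsers.expat


class NamespaceObject:
    """One namespace declaration, linked to the declaration in scope before it."""

    def __init__(self, prefix, uri, parent, dictionary):
        self.prefix = prefix          # "" stands for the default namespace
        self.uri = uri
        self.parent = parent          # previously declared namespace object
        self.dictionary = dictionary  # tag name -> token for this namespace

    def resolve(self, prefix):
        # Walk outward until a declaration for this prefix is found.
        node = self
        while node is not None:
            if node.prefix == prefix:
                return node
            node = node.parent
        return None


def resolve_tags(xml_bytes, dictionaries):
    """Return (namespace URI, token) for every start tag, in document order."""
    current = None   # innermost namespace object in scope
    scopes = []      # value of `current` saved for each open element
    resolved = []

    def start(name, attrs):
        nonlocal current
        scopes.append(current)
        for attr, value in attrs.items():
            if attr == "xmlns" or attr.startswith("xmlns:"):
                prefix = attr.partition(":")[2]
                current = NamespaceObject(
                    prefix, value, current, dictionaries.get(value, {}))
        prefix, _, local = name.rpartition(":")
        ns = current.resolve(prefix) if current is not None else None
        if ns is None:
            resolved.append((None, None))
        else:
            resolved.append((ns.uri, ns.dictionary.get(local)))

    def end(name):
        nonlocal current
        current = scopes.pop()  # declarations go out of scope with their element

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return resolved
```

Linking each object to its predecessor lets an inner element resolve a prefix declared on any ancestor, while restoring `current` at each end tag keeps declarations from leaking to sibling elements.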
Abstract:
An architecture for processing an Extensible Markup Language (XML) document converts schema elements in the XML document to document type definition (DTD) objects that can be used to validate data elements in the XML document. The architecture uses a node factory design in which an XML parser calls one or more node factory interfaces to construct an in-memory tree representation of an XML document. One of the node factory interfaces is a schema node factory, which is a thin layer that receives calls from the parser to build nodes in the tree representation and translates those calls to calls to a schema builder. The schema builder is a table-driven interface that converts the schema elements in the XML document into DTD objects. The DTD objects are then used to validate the data elements as belonging to the schema. If valid, the data elements are used to construct the tree representation.
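The layering described here (parser, thin schema node factory, table-driven schema builder) can be sketched roughly as follows. Everything in it is an assumption made for illustration: the schema vocabulary (`ElementType` and `element` tags with `name` and `type` attributes), the class and method names, and the use of a plain dictionary of allowed child names as a stand-in for the DTD objects. It uses Python's standard `xml.parsers.expat` as the parser.

```python
import xml.parsers.expat


class SchemaBuilder:
    """Table-driven: a dispatch table maps schema element names to handlers."""

    def __init__(self):
        self.declarations = {}  # element name -> set of allowed child names
        self._current = None
        self._table = {
            "ElementType": self._begin_element_type,
            "element": self._add_child,
        }

    def handle(self, name, attrs):
        handler = self._table.get(name)
        if handler is not None:  # names absent from the table are ignored
            handler(attrs)

    def _begin_element_type(self, attrs):
        self._current = self.declarations.setdefault(attrs["name"], set())

    def _add_child(self, attrs):
        if self._current is not None:
            self._current.add(attrs["type"])


class SchemaNodeFactory:
    """Thin layer: receives node-creation calls and forwards them to the builder."""

    def __init__(self, builder):
        self.builder = builder

    def create_node(self, name, attrs):
        self.builder.handle(name, attrs)


def load_schema(schema_bytes):
    """Parse a schema document and return the declarations it defines."""
    builder = SchemaBuilder()
    factory = SchemaNodeFactory(builder)
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = factory.create_node  # the parser calls the factory
    parser.Parse(schema_bytes, True)
    return builder.declarations


def is_valid_child(declarations, parent, child):
    """Check a data element against the declarations built from the schema."""
    return child in declarations.get(parent, set())
```

The factory holds no schema logic of its own, so supporting a new schema element in this sketch only means adding a row to the builder's table.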
Abstract:
The automatic generation of schemas for XML documents is provided. In an illustrative implementation, a computer-readable medium having computer-readable instructions to instruct a computing environment to execute one or more inference algorithms is provided. In operation, an XML document is processed according to the computer-readable instructions such that the content and tags of the XML document are identified. The XML document is processed according to an inference algorithm, which executes one or more processing rules and uses the XML document information, in conjunction with the rules and operations of the XML schema definition language, to automatically produce a schema for the XML document.
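A much-simplified illustration of schema inference is sketched below. It is not the inference algorithm the abstract refers to, whose rules are not given here: it applies only two hypothetical rules (collect the child elements and attributes seen under each tag, and call text `xs:integer` when every observed value parses as an integer), and it returns a plain dictionary rather than an XSD document. It uses Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def infer_text_type(values):
    """Return None for no text, 'xs:integer' if all values parse as int,
    otherwise 'xs:string'."""
    values = [v for v in values if v]
    if not values:
        return None
    try:
        for v in values:
            int(v)
    except ValueError:
        return "xs:string"
    return "xs:integer"


def infer_schema(xml_text):
    """Summarize, per tag, the children, attributes, and text type observed."""
    root = ET.fromstring(xml_text)
    children, attributes, texts = {}, {}, {}
    for elem in root.iter():
        children.setdefault(elem.tag, set()).update(c.tag for c in elem)
        attributes.setdefault(elem.tag, set()).update(elem.attrib)
        texts.setdefault(elem.tag, []).append((elem.text or "").strip())
    return {
        tag: {
            "children": sorted(children[tag]),
            "attributes": sorted(attributes[tag]),
            "text_type": infer_text_type(texts[tag]),
        }
        for tag in children
    }
```

Merging observations across every occurrence of a tag is what lets an element that appears only in some instances, such as an optional child, still show up in the inferred description.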
Abstract:
Metadata may be stored in, and retrieved from, a scalable, fault-tolerant metadata service. In one example, metadata is divided into partitions, and each partition is served by one or more nodes. For each partition, a first one of the nodes may handle read and write requests, and the other nodes may handle read requests in the event that the first node is down or is experiencing high load. When a request is made with respect to metadata, a metadata server may identify the node, within the partition to which the metadata is assigned, that should receive the request. The entity that is making the request then contacts that node and requests the read or write on the metadata. In a partition, metadata may be replicated between the first node and the other nodes using a log-based replication protocol.
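The node-selection step can be sketched as below. The abstract does not say how metadata is assigned to partitions or how load is measured, so the CRC32-based assignment, the boolean `up` and `overloaded` flags, and the names `Node`, `Partition`, `partition_for`, and `pick_node` are all assumptions made for illustration. Log-based replication is not modeled.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    up: bool = True
    overloaded: bool = False


@dataclass
class Partition:
    primary: Node                                  # handles reads and writes
    replicas: list = field(default_factory=list)   # handle fallback reads


def partition_for(key, partitions):
    # Assumed assignment rule: a stable hash of the metadata key.
    return partitions[zlib.crc32(key.encode("utf-8")) % len(partitions)]


def pick_node(key, op, partitions):
    """Return the node that a client should contact for this request."""
    part = partition_for(key, partitions)
    primary = part.primary
    if op == "write":
        if not primary.up:
            raise RuntimeError("primary is down; write cannot be served")
        return primary
    if primary.up and not primary.overloaded:
        return primary
    for replica in part.replicas:
        if replica.up:
            return replica   # read falls back to a replica
    if primary.up:
        return primary       # overloaded, but the only node available
    raise RuntimeError("no node available for read")
```

The function only names a node; as in the abstract, the requesting entity then contacts that node itself, which keeps the metadata server out of the data path.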