Abstract:
Techniques for fast and scalable generation and aggregation of XML data are described. In an example embodiment, an XML query that requests data from XML documents is received. The XML query is evaluated to determine one or more XML results. For each particular XML result, evaluating the XML query comprises: instantiating a particular data structure that represents the particular XML result, where the particular data structure is encoded in accordance with tags specified in the XML query but does not store the tags; and storing, in the particular data structure, one or more locators that respectively point to one or more fragments in the XML documents, where the particular data structure stores the one or more locators but does not store the one or more fragments. On demand, in response to a request indicating the particular XML result, a serialized representation of the particular XML result is generated based at least on the particular data structure.
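The deferred-serialization idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Locator` layout (document id plus character offsets), the `LazyXmlResult` class, the toy `evaluate` function, and the `tag_table` list are all hypothetical names chosen for illustration. The sketch only shows a result structure that holds a tag index and locators, while the tag names and fragment text are resolved when serialization is requested.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Locator:
    """Points at a fragment inside a source document (hypothetical layout)."""
    doc_id: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


@dataclass
class LazyXmlResult:
    """Stores a tag index and locators, but neither tag names nor fragment text."""
    wrapper_tag_id: int
    locators: List[Locator] = field(default_factory=list)


def evaluate(documents: Dict[str, str], element: str, wrapper_tag_id: int) -> LazyXmlResult:
    """Toy stand-in for query evaluation: locate every <element>...</element> fragment."""
    result = LazyXmlResult(wrapper_tag_id)
    open_tag, close_tag = f"<{element}>", f"</{element}>"
    for doc_id, text in documents.items():
        pos = text.find(open_tag)
        while pos != -1:
            end = text.find(close_tag, pos)
            if end == -1:
                break
            end += len(close_tag)
            # Record where the fragment lives instead of copying it.
            result.locators.append(Locator(doc_id, pos, end))
            pos = text.find(open_tag, end)
    return result


def serialize(result: LazyXmlResult, tag_table: List[str], documents: Dict[str, str]) -> str:
    """On demand, resolve the tag index and dereference each locator."""
    tag = tag_table[result.wrapper_tag_id]
    body = "".join(documents[loc.doc_id][loc.start:loc.end] for loc in result.locators)
    return f"<{tag}>{body}</{tag}>"


documents = {
    "d1": "<order><item>pen</item><item>ink</item></order>",
    "d2": "<order><item>paper</item></order>",
}
tag_table = ["items"]  # tags named by the query, kept outside the result structure

result = evaluate(documents, "item", wrapper_tag_id=0)
xml = serialize(result, tag_table, documents)
print(xml)
```

In this sketch the result for three fragments occupies three small locator records regardless of fragment size, and the cost of copying fragment text is paid only if `serialize` is actually called.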
Abstract:
Techniques are described for ranking the relevance of electronic documents, such as web pages. An algorithm extracts keywords and recurring phrases from the anchor tag data in electronic documents to define a set of concepts. The algorithm then uses (link, concept) pairs to create nodes in a graph. In this graph, edges can represent both explicit and implicit conceptual links between nodes. By including conceptual data, the algorithm may model and utilize inter-concept relationships when using graph ranking algorithms. This may improve result accuracy not only by retrieving links that are more authoritative given a user's context, but also by utilizing a larger pool of web pages that are limited by concept space rather than keyword space.
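A minimal sketch of how such a concept graph might be built and ranked follows. The abstract does not specify the construction rules, so several choices here are assumptions made for illustration: a concept is any word that recurs in at least two anchor texts, an explicit edge follows a hyperlink from every node of the linking page to the (target, concept) node named by the anchor, an implicit edge connects nodes that share a concept, and a plain PageRank power iteration stands in for the unspecified graph ranking algorithm. The sample `anchors` data and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# (source page, target page, anchor text); invented sample data.
anchors = [
    ("a.example", "b.example", "python tutorial"),
    ("a.example", "c.example", "python reference"),
    ("b.example", "c.example", "language reference"),
    ("c.example", "b.example", "beginner tutorial"),
]


def extract_concepts(anchors, min_count=2):
    """Assumption: a concept is a word that recurs across anchor texts."""
    counts = Counter(w for _, _, text in anchors for w in set(text.lower().split()))
    return {w for w, n in counts.items() if n >= min_count}


def build_graph(anchors, concepts):
    """Nodes are (link, concept) pairs; returns node -> set of out-neighbours."""
    labelled = [(src, dst, set(text.lower().split()) & concepts)
                for src, dst, text in anchors]
    graph = {(dst, c): set() for _, dst, cs in labelled for c in cs}
    by_url, by_concept = defaultdict(set), defaultdict(set)
    for url, concept in graph:
        by_url[url].add((url, concept))
        by_concept[concept].add((url, concept))
    # Explicit edges: follow the hyperlink from each node of the linking page.
    for src, dst, cs in labelled:
        for c in cs:
            for src_node in by_url[src]:
                graph[src_node].add((dst, c))
    # Implicit edges: nodes on different pages that share a concept.
    for group in by_concept.values():
        for u, v in permutations(group, 2):
            graph[u].add(v)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power iteration; rank of dangling nodes is spread uniformly."""
    nodes = sorted(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in graph[v]:
                new[w] += damping * rank[v] / len(graph[v])
        rank = new
    return rank


concepts = extract_concepts(anchors)
graph = build_graph(anchors, concepts)
scores = pagerank(graph)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 3))
```

Because each node pairs a page with a concept, the same page can receive different scores for different concepts, which is one way the ranking could reflect a user's context rather than raw keyword matches.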