Abstract:
Metadata search is enhanced by utilizing relationship data indicating relationships between metadata items. A server generates an index mapping metadata items to terms associated with the metadata items and a graph describing relationships between each of the metadata items. When the server receives a search request, the server locates a candidate set of the metadata items based on the search term(s) and the index. The server performs a link analysis of the graph to determine a relationship score for each metadata item. For each particular metadata item in the candidate set of the metadata items, the server calculates a ranking score based at least on the relationship score for the particular metadata item. The server generates a ranked result set based on comparing the ranking scores for the candidate set of metadata items. The server then provides information indicating the ranked result set in response to the search request.
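The flow described above (term index, candidate lookup, link analysis, ranking) can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the PageRank-style iteration stands in for whatever link analysis the server uses, and the function names, the damping factor, and the rule of multiplying the term-match count by the relationship score are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(items):
    """Map each term to the set of metadata item ids associated with it."""
    index = defaultdict(set)
    for item_id, terms in items.items():
        for term in terms:
            index[term.lower()].add(item_id)
    return index


def relationship_scores(graph, damping=0.85, iterations=50):
    """PageRank-style link analysis over a directed graph {node: [targets]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = graph.get(src, [])
            # A node without outgoing links spreads its score over all nodes.
            receivers = targets if targets else nodes
            share = damping * score[src] / len(receivers)
            for dst in receivers:
                nxt[dst] += share
        score = nxt
    return score


def search(query, index, graph):
    """Return candidate item ids, best first."""
    terms = query.lower().split()
    candidates = set()
    for term in terms:
        candidates |= index.get(term, set())
    rel = relationship_scores(graph)

    def ranking_score(item):
        # Assumed combination: number of matching terms times relationship score.
        matches = sum(1 for t in terms if item in index.get(t, set()))
        return matches * rel.get(item, 0.0)

    return sorted(candidates, key=lambda item: (-ranking_score(item), item))
```

Items that many other items point to rise in the ranking even when their term match is no better than that of their neighbors.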
Abstract:
Techniques are disclosed for presenting semi-structured sets of search results comprising two or more differently-structured subsets of search results. The search results are divided into two or more groups of similarly-structured results. The search results are displayed in their respective groups rather than as a single set. Each group is displayed using a different display structure, in an order determined by a group ranking mechanism. The search results within a group are ordered by a result ranking mechanism. Techniques are also disclosed for enhancing a display of similarly structured data by emphasizing highly relevant result fields. The highly relevant result fields may be identified based on metadata ranking mechanisms, uniqueness of their constituent values, historical feedback, keyword location, and/or other mechanisms. The fields are emphasized using, without limitation, highlighting, reordering, and filtering of unemphasized fields from the display.
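The grouping step can be illustrated with a small sketch. It assumes each result is a dictionary with a `fields` mapping and a numeric `score`, treats two results as similarly structured when they have the same set of field names, and ranks groups by their best member. All three choices are assumptions made for the example, not the mechanisms of the disclosure.

```python
from collections import defaultdict


def group_results(results):
    """Group results by field-name set, then order groups and their members."""
    groups = defaultdict(list)
    for result in results:
        shape = frozenset(result["fields"])  # the set of field names
        groups[shape].append(result)

    ordered = []
    for shape, members in groups.items():
        # Result ranking within a group: highest score first.
        members.sort(key=lambda r: -r["score"])
        ordered.append((sorted(shape), members))

    # Group ranking (assumed): order groups by their best-scoring member.
    ordered.sort(key=lambda group: -group[1][0]["score"])
    return ordered
```

Each returned pair holds the group's field names, which a display layer could use as column headers, and the ranked results for that group.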
Abstract:
Systems, methods, and other embodiments associated with providing query modes for translation-enabled XML documents are described. One method embodiment includes receiving an XPath query to an XML document that may store a translation for a data element. The method embodiment may also include automatically selecting a query mode for the XPath query. The method embodiment may also include querying the XML document using the XPath query and the selected query mode. The query mode may control, at least in part, the operation of an XML database logic.
Abstract:
A method and system are provided for extracting a valid, self-contained fragment for a node in an XML document stored in a database management system. An XML index is used to identify a location in which XML fragment data corresponding to the node is located. Ancestors of the node are identified and examined for any information needed for the proper interpretation of the fragment. If an ancestor node contains such needed information, this information is patched into the XML fragment to ensure that the fragment is a valid, self-contained XML fragment.
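One kind of ancestor information a fragment may need is a namespace declaration. The sketch below works on an in-memory DOM rather than an XML index in a database, and it copies every in-scope `xmlns` declaration instead of only those the fragment actually uses. Both are simplifications, and the function name is hypothetical.

```python
from xml.dom import Node, minidom


def self_contained_fragment(node):
    """Serialize an element with ancestor namespace declarations patched in."""
    fragment = node.cloneNode(True)  # deep copy; the source document is untouched
    ancestor = node.parentNode
    # Walk from the nearest ancestor outward so nearer declarations win.
    while ancestor is not None and ancestor.nodeType == Node.ELEMENT_NODE:
        attrs = ancestor.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            is_ns_decl = attr.name == "xmlns" or attr.name.startswith("xmlns:")
            if is_ns_decl and not fragment.hasAttribute(attr.name):
                fragment.setAttribute(attr.name, attr.value)
        ancestor = ancestor.parentNode
    return fragment.toxml()
```

Without the patch, a fragment such as `<p:item>` whose `p` prefix is declared only on the root element cannot be parsed on its own.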
Abstract:
A method and apparatus for streaming validation of XML documents is provided. A particular event of a series of events is received. The series of events is generated as an XML document is parsed by a parser, and the received particular event indicates that the parser has encountered a particular part of the XML document. The particular part of the XML document indicated by the particular event is then received. A current validation state for the XML document is determined. The current validation state, which is one of a plurality of validation states for the XML document, indicates a validation type associated with the particular part of the XML document. Based on at least the current validation state, the particular part of the XML document is validated against an XML schema that defines the structure of the XML document.
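A rough sketch of event-driven validation with a per-element validation state follows. The toy schema format (allowed child names plus an optional `mode`) and the two states `strict` and `skip` are assumptions chosen for illustration. A real XML schema validator checks far more, such as types, attributes, and occurrence constraints.

```python
import io
import xml.etree.ElementTree as ET


def validate_stream(xml_bytes, schema, root_name):
    """Return None if the document passes, else a message for the first violation."""
    states = []  # stack of (element name, validation state) for open elements
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "end":
            states.pop()
            elem.clear()  # drop processed content so memory use stays small
            continue
        if not states:
            if elem.tag != root_name:
                return f"unexpected root <{elem.tag}>"
        else:
            parent, parent_state = states[-1]
            if parent_state == "skip":
                # Everything beneath a skipped element is accepted unchecked.
                states.append((elem.tag, "skip"))
                continue
            if elem.tag not in schema.get(parent, {}).get("children", ()):
                return f"<{elem.tag}> not allowed in <{parent}>"
        # The element's own validation state comes from the (hypothetical) schema.
        states.append((elem.tag, schema.get(elem.tag, {}).get("mode", "strict")))
    return None
```

The state on top of the stack decides how each incoming part is checked, so the document never has to be fully materialized.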
Abstract:
A declarative mechanism is used to manage large documents within a repository. The large documents are sectioned into subdocuments that are linked together by a parent document. The combination of the parent document and subdocument is referred to as a compound document. There are multiple options for configuring rules to break up a source document into a compound document and naming the subdocuments. The compound documents may be queried using statements that treat the compound document as a single XML document, or the parent document of a subdocument may be queried and treated independently. Access control and versioning can be applied at the finer granularity of the subdocument.
Abstract:
A method and apparatus for loading an XML document into memory is provided. A client loads one or more array elements into a first partition of an array that is maintained in memory. Each array element represents an XML element of an XML document. Upon determining that an amount of data maintained in the first partition exceeds a first threshold, the client subsequently loads array elements into a new partition of the array. Upon determining that an amount of data maintained in the memory of the client exceeds a second threshold, the array elements of the least recently used partition are persistently stored in a database without persistently storing the entire XML document. When the last XML element of the XML document is loaded into a partition of the array, that partition is persistently stored in the database, thereby causing the entire XML document to be stored in the database.
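The two thresholds can be sketched with a small class. Here a plain dictionary stands in for the database, sizes are measured in element counts rather than bytes, and `finish()` flushes every partition still in memory. These are all simplifying assumptions, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class PartitionedArray:
    def __init__(self, partition_limit, memory_limit, store):
        self.partition_limit = partition_limit  # first threshold: elements per partition
        self.memory_limit = memory_limit        # second threshold: elements held in memory
        self.store = store                      # stand-in for the database
        self.in_memory = OrderedDict({0: []})   # partition id -> elements, in LRU order
        self.current = 0

    def _resident(self):
        return sum(len(p) for p in self.in_memory.values())

    def append(self, element):
        # Start a new partition once the current one reaches the first threshold.
        if len(self.in_memory[self.current]) >= self.partition_limit:
            self.current += 1
            self.in_memory[self.current] = []
        self.in_memory[self.current].append(element)
        self.in_memory.move_to_end(self.current)  # mark as most recently used
        # Spill least recently used partitions once the second threshold is exceeded.
        while self._resident() > self.memory_limit and len(self.in_memory) > 1:
            pid, elements = self.in_memory.popitem(last=False)
            self.store[pid] = elements

    def finish(self):
        """Persist whatever is still in memory after the last element is loaded."""
        while self.in_memory:
            pid, elements = self.in_memory.popitem(last=False)
            self.store[pid] = elements
```

Only the spilled partitions are written while loading is in progress, so the client never needs to hold or write the whole document at once.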
Abstract:
A method and apparatus are provided for using sibling counts in XML indices to optimize single-path queries. Using a B-tree XML index with a SQL query logarithmically reduces the number of disk accesses by passing over index entries where it is determined that a match will not be found. However, because certain index entries are passed over, it is impossible to ascertain whether a path expression occurs more than once in the XML index, as certain queries sometimes require. This hurdle can be overcome by maintaining a sibling count with each node entry in the XML index. Because the sibling count is stored with the index entry, the index will reveal whether the matching node is single or has other siblings. In addition to being rewritten for optimization by use of an XML index, the original query is rewritten to check for a single-path condition in the index.
Abstract:
An approach is provided to partition inter-linked documents into partitions of a database system. In some embodiments, a plurality of documents may be assigned to two or more partitions in the database system, thereby forming a number of inter-partition links between a first partition and a second partition. Here both the first partition and the second partition are in the two or more partitions. First documents may be assigned to the first partition while second documents are assigned to the second partition. Both the first documents and the second documents are in the plurality of documents. It is then determined whether moving one or more of the first documents in the first partition to the second partition reduces the number of inter-partition links between the first partition and the second partition. If that is the case, the one or more of the first documents are moved to the second partition.
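The move test can be sketched as a greedy refinement loop. Links are modeled as undirected pairs of document ids and the assignment as a dictionary from document id to partition id. Both representations are assumptions, and the sketch ignores partition size balance, which a practical system would likely also consider.

```python
def cut_size(links, assignment):
    """Count links whose two endpoints sit in different partitions."""
    return sum(1 for a, b in links if assignment[a] != assignment[b])


def refine(links, assignment, src, dst):
    """Move documents from src to dst whenever that lowers the cut size."""
    assignment = dict(assignment)  # work on a copy
    improved = True
    while improved:
        improved = False
        for doc in sorted(d for d, p in assignment.items() if p == src):
            before = cut_size(links, assignment)
            assignment[doc] = dst          # tentatively move the document
            if cut_size(links, assignment) < before:
                improved = True            # keep the move
            else:
                assignment[doc] = src      # no reduction: undo the move
    return assignment
```

Because a move is kept only when it strictly reduces the number of inter-partition links, the loop always terminates.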
Abstract:
A method and system are provided for determining whether a given path is an indexed path of XML documents stored in a database management system. A finite state machine is built using the path subsetting rules specified by a user. The finite state machine is traversed using the given path. If any accepting states are reached during the traversal of the finite state machine, the given path is determined to match the path subsetting rules.
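A minimal sketch of the idea follows, assuming the path subsetting rules are absolute paths whose steps are either element names or a `*` wildcard. Real rule syntax may be richer, and the function names are hypothetical. Following the wording above, a path counts as matching as soon as any accepting state is reached during the traversal.

```python
def build_fsm(rules):
    """Build a trie-shaped state machine; transitions[state][step] -> next state."""
    transitions = [{}]  # state 0 is the start state
    accepting = set()
    for rule in rules:
        state = 0
        for step in rule.strip("/").split("/"):
            if step not in transitions[state]:
                transitions.append({})
                transitions[state][step] = len(transitions) - 1
            state = transitions[state][step]
        accepting.add(state)  # the end of a rule is an accepting state
    return transitions, accepting


def is_indexed(path, fsm):
    """Traverse the machine with the path; True once an accepting state is reached."""
    transitions, accepting = fsm
    current = {0}
    for step in path.strip("/").split("/"):
        nxt = set()
        for state in current:
            # Follow both the literal step and the wildcard transition, if present.
            for label in (step, "*"):
                if label in transitions[state]:
                    nxt.add(transitions[state][label])
        current = nxt
        if current & accepting:
            return True
        if not current:
            return False
    return False
```

Tracking a set of current states lets a literal rule and a wildcard rule be followed at the same time without backtracking.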