Abstract:
The present invention relates to means and a computer-executable method for navigating within a tree structure whose leaf nodes represent arbitrary types of objects, i.e. related data treated as a unit. According to the current teaching, a travel point representation step is suggested: after at least one non-leaf node is selected as a travel point, only the path and the non-leaf nodes in said tree structure from said travel point to the root of said tree structure are represented in a tree view area. Moreover, the complete sub-tree of said travel point is represented in said tree view area. In addition or alternatively, after selection of said travel point, a travel box is represented for said travel point, said travel box representing the object identifications of all objects of all leaf nodes in said sub-tree of said travel point.
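The travel point representation can be illustrated with a minimal sketch. The `Node` class, the helper names (`path_to_root`, `tree_view`, `travel_box`) and the sample tree are assumptions made for illustration; the abstract does not prescribe any data structure or rendering.

```python
class Node:
    """Tree node; `obj` holds an object identification on leaf nodes only."""

    def __init__(self, name, obj=None):
        self.name = name
        self.obj = obj
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def path_to_root(travel_point):
    """Names of the nodes from the root down to the travel point."""
    path = []
    node = travel_point
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path[::-1]


def subtree_lines(node, depth):
    """Indented listing of the complete sub-tree below `node`."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(subtree_lines(child, depth + 1))
    return lines


def tree_view(travel_point):
    """Path from the root to the travel point, then its whole sub-tree.

    Siblings of the nodes on the path are deliberately left out.
    """
    ancestors = path_to_root(travel_point)[:-1]
    lines = ["  " * depth + name for depth, name in enumerate(ancestors)]
    lines.extend(subtree_lines(travel_point, len(ancestors)))
    return lines


def travel_box(travel_point):
    """Object identifications of all leaf nodes below the travel point."""
    if not travel_point.children:
        return [travel_point.obj] if travel_point.obj is not None else []
    found = []
    for child in travel_point.children:
        found.extend(travel_box(child))
    return found


# Hypothetical sample tree.
root = Node("root")
docs = root.add(Node("docs"))
reports = docs.add(Node("reports"))
reports.add(Node("q1", obj="report-q1.pdf"))
reports.add(Node("q2", obj="report-q2.pdf"))
docs.add(Node("memo", obj="memo.txt"))
root.add(Node("images")).add(Node("logo", obj="logo.png"))

print("\n".join(tree_view(reports)))
print(travel_box(docs))
```

With `reports` as the travel point, the view lists `root`, `docs`, `reports` and the two report leaves, while the `images` branch and the `memo` leaf are not shown.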
Abstract:
This mechanism relates to information mining within a multitude of documents stored on computer systems. More particularly, this mechanism relates to a computerized method of generating a content taxonomy for a multitude of electronic documents. The technique proposed by the current invention is able to improve the scalability of taxonomy generation and, at the same time, its coherence and selectivity. The fundamental approach of the current invention comprises a subset selection step, wherein a subset of the multitude of documents is selected. In a taxonomy generation step, a taxonomy is generated for the selected subset of documents, the taxonomy being a tree-structured taxonomy hierarchy. Moreover, the method comprises a routing selection step that assigns each unprocessed document to the taxonomy hierarchy based on largest similarity.
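The routing selection step can be sketched as follows. Several things here are assumptions, not taken from the abstract: the taxonomy is flattened to its nodes and built by hand (standing in for the taxonomy generation step), documents are compared as bags of words, and similarity is cosine similarity. The function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed document representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def route(documents, taxonomy):
    """Assign each unprocessed document to the most similar taxonomy node.

    `taxonomy` maps a node name to the subset documents already placed there.
    """
    centroids = {
        node: sum((vectorize(d) for d in docs), Counter())
        for node, docs in taxonomy.items()
    }
    assignment = {}
    for doc in documents:
        vec = vectorize(doc)
        assignment[doc] = max(
            centroids, key=lambda node: cosine(vec, centroids[node])
        )
    return assignment


# Hypothetical taxonomy generated from the selected subset.
taxonomy = {
    "databases": ["sql query index table", "transaction index storage"],
    "networking": ["tcp packet routing", "network router packet latency"],
}
unprocessed = ["query table join", "packet loss on router"]
print(route(unprocessed, taxonomy))
```

Only the subset passes through the expensive taxonomy generation; every remaining document costs one similarity comparison per node, which is where the scalability gain would come from under these assumptions.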
Abstract:
A method of analyzing a string-pattern includes defining a minimum length (Lmin_1) of substrings (STR_A_B) to be considered; defining a maximum length (Lmax_1) of substrings (STR_A_B) to be considered; with a computer, searching the string-pattern for substrings (STR_A_B) with a length in an interval between the minimum length (Lmin_1) and the maximum length (Lmax_1); counting an occurrence (Occ_A_B) of each substring (STR_A_B) found with a length in the interval between the minimum length (Lmin_1) and the maximum length (Lmax_1); and pruning away a number of the substrings (STR_A_B) that meet one or more criteria. The criteria are selected from the group consisting of (1) being contained inside the maximum substring (STR_A_C) in a subset (SET_A) of substrings (STR_A_B), (2) being shorter than the maximum substring (STR_A_C), (3) occurring with a same frequency as the maximum substring (STR_A_C), and combinations thereof.
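One possible reading of the counting and pruning steps is sketched below. It assumes that overlapping occurrences are counted and that all three criteria are applied together: a substring is pruned when some longer counted substring contains it and occurs equally often. The function names are hypothetical, and the abstract also allows other combinations of the criteria.

```python
from collections import Counter


def count_substrings(pattern, lmin, lmax):
    """Count every substring whose length lies in [lmin, lmax]."""
    counts = Counter()
    for length in range(lmin, lmax + 1):
        for start in range(len(pattern) - length + 1):
            counts[pattern[start:start + length]] += 1
    return counts


def prune(counts):
    """Drop substrings that a longer, equally frequent substring contains."""
    kept = {}
    for sub, occ in counts.items():
        dominated = any(
            len(other) > len(sub) and sub in other and other_occ == occ
            for other, other_occ in counts.items()
        )
        if not dominated:
            kept[sub] = occ
    return kept


counts = count_substrings("abcabcabd", 2, 3)
print(prune(counts))
```

In this example `bc` occurs twice and is contained in `abc`, which also occurs twice, so `bc` carries no extra information and is pruned; `ab` occurs three times, more often than any longer substring containing it, so it is kept.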
Abstract:
A method and systems for providing XML data are disclosed. In accordance with an embodiment of the invention, a second data processing system, which is connected to a first data processing system via a network, receives a first request over the network from the first data processing system. The first request comprises specifications for subsequent transfers of XML data from the second data processing system to the first data processing system. The specifications state, for each type of XML document to be transferred to the first data processing system in subsequent transfers, which excerpts of XML data shall be sent. An acknowledgment message, sent from the second data processing system to the first data processing system, indicates the former's ability to provide the excerpts of XML data for the types of XML documents in the subsequent data transfers.
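The exchange can be sketched in-process, without an actual network. The sketch makes several assumptions that the abstract does not state: the document type is identified by the root element's tag, excerpts are named by simple ElementTree paths, and the acknowledgment is a per-type boolean. The class `ExcerptServer` and its methods are hypothetical.

```python
import xml.etree.ElementTree as ET


class ExcerptServer:
    """Stand-in for the second data processing system."""

    def __init__(self, known_types):
        self.known_types = set(known_types)
        self.specs = {}  # document type -> list of ElementTree paths

    def register(self, specs):
        """Handle the first request and return the acknowledgment."""
        ack = {}
        for doc_type, paths in specs.items():
            supported = doc_type in self.known_types
            if supported:
                self.specs[doc_type] = list(paths)
            ack[doc_type] = supported
        return ack

    def transfer(self, xml_text):
        """Send only the registered excerpts of a document."""
        root = ET.fromstring(xml_text)
        paths = self.specs.get(root.tag)
        if paths is None:
            return xml_text  # no specification: send the full document
        excerpt = ET.Element(root.tag)
        for path in paths:
            excerpt.extend(root.findall(path))
        return ET.tostring(excerpt, encoding="unicode")


server = ExcerptServer(known_types=["order"])
ack = server.register({"order": ["id", "customer/name"], "invoice": ["total"]})

order = (
    "<order><id>7</id>"
    "<customer><name>Ann</name><address>1 Main St</address></customer>"
    "<total>9.50</total></order>"
)
print(ack)
print(server.transfer(order))
```

Here the acknowledgment reports that excerpts can be provided for `order` documents but not for `invoice` documents, and the later transfer carries only the `id` and `name` elements.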
Abstract:
System and computer program product for processing a text search query in a collection of documents. A full posting index is generated that has first index terms and a full posting list for each first index term, enumerating occurrences of the first index terms in the documents of the collection. A text search query includes search conditions on search terms. The search conditions are translated into conditions on the first index terms to provide translated conditions. At least one short posting index is generated, which includes second index terms and a short posting list for each second index term, enumerating documents in which the second index terms occur. Filter conditions and complementary conditions are generated to represent the translated conditions. The filter conditions approximate the translated conditions and are processed using the short posting index. The complementary conditions are processed using the full posting index to provide a query result.
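A phrase query gives a compact illustration of splitting a condition into a filter condition and a complementary condition. In this sketch, which is an assumption about one suitable query type and not the claimed implementation, the filter condition is "all terms occur in the document" (answered from the short posting index) and the complementary condition is "the terms occur at consecutive positions" (answered from the full posting index, for surviving candidates only). The function names are hypothetical.

```python
from collections import defaultdict


def build_indexes(docs):
    """Build a full index (term -> occurrences) and a short index (term -> docs)."""
    full = defaultdict(list)   # term -> [(doc_id, position), ...]
    short = defaultdict(set)   # term -> {doc_id, ...}
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            full[term].append((doc_id, pos))
            short[term].add(doc_id)
    return full, short


def phrase_search(phrase, full, short):
    """Return ids of documents containing the (non-empty) phrase."""
    terms = phrase.lower().split()
    # Filter condition: every term occurs somewhere in the document.
    candidates = set.intersection(*(short.get(t, set()) for t in terms))
    # Complementary condition: terms appear at consecutive positions.
    positions = {t: set(full.get(t, [])) for t in terms}
    hits = set()
    for doc_id, pos in full.get(terms[0], []):
        if doc_id in candidates and all(
            (doc_id, pos + i) in positions[t] for i, t in enumerate(terms)
        ):
            hits.add(doc_id)
    return sorted(hits)


docs = {
    1: "full posting index",
    2: "index posting full list",
    3: "short posting list",
}
full, short = build_indexes(docs)
print(phrase_search("posting index", full, short))
```

Documents 1 and 2 both pass the filter, since each contains both terms, but only document 1 has them adjacent and in order, so it alone passes the complementary check.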
Abstract:
According to the present invention, a method and an infrastructure are provided for processing a text search query in a collection of documents (100). To this end, a full posting index (200) is generated, stored and updated for each document added to the collection (100). Said full posting index (200) comprises a set of index terms and a full posting list for each index term of said set, enumerating all occurrences of said index term in all documents of the collection (100). In addition to said full posting index (200), at least one additional posting index (400, 500, 600) is generated, stored and updated for each document added to the collection (100). Said additional posting index (400, 500, 600) is related to a defined document part and comprises a set of index terms and a restricted posting list for each index term of said set, enumerating all occurrences of said index term in said document part of all documents of the collection (100). A text search query comprises search conditions on search terms, which are translated into conditions on the index terms of said full posting index (200). Then, said translated conditions of a given text search query are optimized (a) by identifying all conditions of said translated conditions that are restricted to defined document parts for which an additional posting index is available, and (b) by re-writing said identified conditions with part restriction as pair conditions on index terms of said additional posting index (400, 500, 600) and the corresponding document part. Thus, said pair conditions can be processed by using only said additional posting index (400, 500, 600).
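The rewriting of part-restricted conditions can be sketched as follows. The document layout (a dict of named parts), the condition format `(term, part)` and the function names are assumptions for illustration; the abstract does not fix any of them.

```python
from collections import defaultdict


def build_indexes(docs, indexed_parts):
    """Build the full index plus one additional index per listed part."""
    full = defaultdict(list)  # term -> [(doc_id, part, position), ...]
    by_part = {part: defaultdict(list) for part in indexed_parts}
    for doc_id, parts in docs.items():
        for part, text in parts.items():
            for pos, term in enumerate(text.lower().split()):
                full[term].append((doc_id, part, pos))
                if part in by_part:
                    by_part[part][term].append((doc_id, pos))
    return full, by_part


def search(conditions, full, by_part):
    """Evaluate a conjunction of (term, part) conditions; part may be None."""
    result = None
    for term, part in conditions:
        if part in by_part:
            # Rewritten pair condition: only the additional index is read.
            matches = {doc_id for doc_id, _ in by_part[part].get(term, [])}
        else:
            # No additional index for this part: scan the full posting list.
            matches = {
                doc_id
                for doc_id, p, _ in full.get(term, [])
                if part is None or p == part
            }
        result = matches if result is None else result & matches
    return sorted(result or [])


docs = {
    1: {"title": "posting index", "body": "query processing"},
    2: {"title": "query optimizer", "body": "posting index lookup"},
}
full, by_part = build_indexes(docs, indexed_parts=["title"])
print(search([("index", "title")], full, by_part))
```

The condition on `title` never touches the full posting list, which is the point of the optimization: the restricted posting list is shorter and needs no per-occurrence part check.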
Abstract:
Provided is a method for processing queries in a database in which data records have a parametric object and an extension of a nonparametric data type. A query includes a parametric condition for the parametric object of the data records and a nonparametric condition for the nonparametric extension of the data records. Parametric information of each data record is translated into constructs of the data type of the extension. A parametric result set of data records for the parametric condition is generated. The parametric condition of said query is translated into a filter condition for said constructs of the data type of the extension. The nonparametric condition of said query and said filter condition are employed to generate a nonparametric result set. The parametric result set and the nonparametric result set are joined to obtain a result set.
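A minimal sketch of this query flow follows, assuming that the nonparametric extension is free text searched through an inverted index and that parametric information is translated into synthetic `name=value` tokens in that same index. The record layout and the function names are hypothetical.

```python
from collections import defaultdict


def build_text_index(records):
    """Index the text extension plus the translated parametric information."""
    index = defaultdict(set)  # token -> {record_id, ...}
    for rec_id, rec in records.items():
        for term in rec["text"].lower().split():
            index[term].add(rec_id)
        # Translate parametric information into constructs of the text index.
        for name, value in rec["params"].items():
            index[f"{name}={value}"].add(rec_id)
    return index


def run_query(records, index, param_cond, text_term):
    """Combine a parametric equality condition with a text term condition."""
    name, value = param_cond
    # Parametric result set, evaluated on the parametric objects.
    parametric = {
        rid for rid, rec in records.items() if rec["params"].get(name) == value
    }
    # Filter condition: the parametric condition expressed as a text token.
    filter_token = f"{name}={value}"
    # Nonparametric result set, already narrowed by the filter condition.
    nonparametric = index.get(text_term.lower(), set()) & index.get(
        filter_token, set()
    )
    # Join both result sets.
    return sorted(parametric & nonparametric)


records = {
    1: {"params": {"year": 2004}, "text": "full posting index"},
    2: {"params": {"year": 2005}, "text": "short posting index"},
    3: {"params": {"year": 2004}, "text": "taxonomy generation"},
}
index = build_text_index(records)
print(run_query(records, index, ("year", 2004), "index"))
```

Because the filter condition is applied inside the text search, the nonparametric result set stays small before the join; without it, every record containing the term would have to be carried into the join.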
Abstract:
Method and Infrastructure for Processing Queries in a Database. According to the present invention, a method and an infrastructure are provided for processing queries in a database (1) of data records, each comprising at least one parametric object with parametric information and at least one extension of a nonparametric datatype, the query comprising at least one parametric condition for the parametric object of the data records and at least one nonparametric condition for the nonparametric extension of the data records. First, at least parts of the parametric information of each data record are translated into constructs of the datatype of the extension. Processing a query comprises evaluation of a parametric result set (2) of data records for the parametric condition. In order to evaluate a nonparametric result set (5) of data records for the nonparametric condition, the parametric condition of said query is translated into at least one filter condition for said constructs of the datatype of the extension. Then, both the nonparametric condition of said query and said filter condition are considered when evaluating the nonparametric result set (5). Finally, the parametric result set (2) and the nonparametric result set (5) are joined to obtain a result set (4) for the query as a whole.
Abstract:
Processing is provided for operating an in-memory database, wherein transaction data is stored by a persistence buffer in a FIFO queue, and an update processor subsequently: waits for a trigger; extracts the last transactional data associated with a single transaction of the in-memory database from the FIFO memory queue; determines whether the transaction data includes updates to data fields in the in-memory database which were already processed; and if not, stores the extracted transaction data to a store queue, remembering the fields updated in the in-memory database, or otherwise updates the store queue with the extracted transaction data. The process continues until the extracting is complete, and the content of the store queue is periodically written to a persistent storage device.
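One way to read the update processor is as a write-coalescing buffer, sketched below. The sketch assumes that transactions are drained from the FIFO queue oldest first, so that a later update to an already processed field replaces the pending value; that the store queue can be modeled as a field-to-value mapping; and that a plain dict stands in for the persistent storage device. The class and attribute names are hypothetical.

```python
from collections import deque


class PersistenceBuffer:
    """Coalesces in-memory database updates before they are persisted."""

    def __init__(self):
        self.fifo = deque()       # transactions in commit order
        self.store_queue = {}     # field -> value awaiting persistence
        self.seen_fields = set()  # fields already processed since last flush
        self.coalesced = 0        # updates merged into an existing entry
        self.persistent = {}      # stand-in for the persistent storage device

    def record(self, transaction):
        """Called by the database for each committed transaction."""
        self.fifo.append(dict(transaction))

    def process(self):
        """Run by the update processor when the trigger fires."""
        while self.fifo:
            tx = self.fifo.popleft()
            already = self.seen_fields.intersection(tx)
            if not already:
                # New fields only: store them and remember them.
                self.store_queue.update(tx)
                self.seen_fields.update(tx)
            else:
                # Some fields were processed before: update the store queue
                # in place instead of queueing a second write for them.
                self.coalesced += len(already)
                self.store_queue.update(tx)
                self.seen_fields.update(tx)

    def flush(self):
        """Periodic write of the store queue to persistent storage."""
        self.persistent.update(self.store_queue)
        self.store_queue.clear()
        self.seen_fields.clear()


buf = PersistenceBuffer()
buf.record({"a": 1, "b": 2})
buf.record({"b": 3})
buf.record({"c": 4})
buf.process()
pending = dict(buf.store_queue)
buf.flush()
print(pending, buf.coalesced, buf.persistent)
```

In this run three transactions touch four field updates, but only three values reach storage, because the second update to `b` overwrites the pending one.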
Abstract:
A method, system and computer program product implementing the method are provided to process a text search query in a collection of documents. A full posting index is generated for the documents in the collection. The full posting index comprises one or more first index terms and a full posting list for each first index term, enumerating the occurrences of the first index term in the documents. In addition to the full posting index, at least one additional posting index is generated for the documents. The additional posting index is related to a defined document part and comprises one or more second index terms and a restricted posting list for each second index term, enumerating all occurrences of the second index term in the document part of the documents of the collection. The text search query is performed using the additional posting index.