Abstract:
When a document to be searched is registered, two tables are created: a document identifier management table, which records for each page the range of document identifiers stored on that page together with the page identifier, and a per-search-server search range management table, which records the range of document identifiers assigned to each search server. When a search is executed, each search server refers to the search range management table to obtain its allocated range of document identifiers. For each index key forming a query term specified as a query condition, the server refers to the document identifier management table to obtain the identifiers of the pages that store document identifiers in the allocated range. The search is then carried out by referring to the pages indicated by the acquired page identifiers.
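The two-table lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the table contents, the names `PageEntry`, `DOC_ID_TABLE`, `SERVER_RANGES`, and `pages_for_server`, and the inclusive-range overlap test are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageEntry:
    page_id: int
    min_doc_id: int  # smallest document identifier stored on the page
    max_doc_id: int  # largest document identifier stored on the page


# Hypothetical document identifier management table: index key -> pages.
DOC_ID_TABLE = {
    "search": [PageEntry(1, 1, 100), PageEntry(2, 101, 250), PageEntry(3, 251, 400)],
    "index": [PageEntry(4, 1, 300), PageEntry(5, 301, 400)],
}

# Hypothetical per-search-server search range management table (inclusive ranges).
SERVER_RANGES = {"server_a": (1, 200), "server_b": (201, 400)}


def pages_for_server(server, index_keys):
    """Return, for each index key, the page ids this server has to read."""
    lo, hi = SERVER_RANGES[server]
    result = {}
    for key in index_keys:
        # A page is relevant when its identifier range overlaps the server's range.
        result[key] = [
            page.page_id
            for page in DOC_ID_TABLE.get(key, [])
            if page.min_doc_id <= hi and page.max_doc_id >= lo
        ]
    return result
```

With these sample tables, `pages_for_server("server_a", ["search", "index"])` selects pages 1 and 2 for "search" and page 4 for "index", so the server never touches pages that hold only identifiers outside its range.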
Abstract:
A technique for changing the nodes in a computer-based information retrieval system. When information items are registered by allocating them to n nodes, index information is extracted as a set of pairs of index keys and addresses of the information items, the index information is divided into m (m > n) buckets, and a partial inverted file that is self-contained within each bucket is produced. Here, m and n are integers of 1 or greater. When the allocation of search target ranges to the nodes is altered, the allocation of buckets to each node is changed, and the partial inverted file of each bucket is merged with the inverted file of the existing indexes to produce new indexes, so that indexes can be produced and updated at high speed.
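The bucket scheme can be illustrated with a small sketch. Several details here are assumptions that the abstract does not specify: buckets are chosen by `address % m`, buckets are assigned to nodes round-robin, and a node index is rebuilt by merging the partial inverted files of its buckets rather than by merging into an existing index. All function names are hypothetical.

```python
from collections import defaultdict


def build_partial_inverted_files(docs, m):
    """docs maps an item address (int) to its index keys.

    Returns m partial inverted files, one per bucket (key -> addresses).
    Assumption: an item belongs to bucket `address % m`.
    """
    buckets = [defaultdict(list) for _ in range(m)]
    for address in sorted(docs):
        for key in docs[address]:
            postings = buckets[address % m][key]
            if address not in postings:
                postings.append(address)
    return buckets


def assign_buckets(m, n):
    """Assign m buckets to n nodes round-robin (an assumed policy)."""
    assignment = defaultdict(list)
    for bucket_id in range(m):
        assignment[bucket_id % n].append(bucket_id)
    return dict(assignment)


def merge_index(buckets, bucket_ids):
    """Merge the partial inverted files of the given buckets into one node index."""
    merged = defaultdict(list)
    for bucket_id in bucket_ids:
        for key, postings in buckets[bucket_id].items():
            merged[key].extend(postings)
    return {key: sorted(postings) for key, postings in merged.items()}
```

Because each partial inverted file is closed within its bucket, changing the number of nodes only requires a new call to `assign_buckets` followed by `merge_index` per node; the items themselves do not need to be re-indexed.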
Abstract:
The purpose of the invention is to provide a log management computer that shortens log search time while reducing log storage volume. The log management computer manages a log acquired from a log generating system that generates the log, which is an operation record. The log management computer is characterized by: extracting from a log message contained in the log, both a common portion that is common with another log message and a different portion that is different from another log message; storing the extracted common portion in common portion information of a storage area; storing the extracted different portion in different portion information of the storage area; and if a search request containing a search condition is received, searching for a log message that matches the search condition.
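One way to picture the split into common and different portions is the sketch below. It is an illustration under stated assumptions, not the method of the invention: the different portion is taken to be every whitespace-separated token containing a digit, the placeholder `<*>` and the class `LogStore` are hypothetical, and search is a plain substring match on the rebuilt message.

```python
class LogStore:
    PLACEHOLDER = "<*>"  # hypothetical marker for a different portion

    def __init__(self):
        self.templates = []     # common portion information, stored once each
        self.template_ids = {}  # template text -> position in self.templates
        self.records = []       # per message: (template id, different portions)

    def add(self, message):
        common, different = [], []
        for token in message.split():
            # Assumption: tokens containing a digit vary between messages.
            if any(ch.isdigit() for ch in token):
                common.append(self.PLACEHOLDER)
                different.append(token)
            else:
                common.append(token)
        template = " ".join(common)
        if template not in self.template_ids:
            self.template_ids[template] = len(self.templates)
            self.templates.append(template)
        self.records.append((self.template_ids[template], different))

    def rebuild(self, record):
        """Reassemble a message (whitespace is normalized to single spaces)."""
        template_id, different = record
        values = iter(different)
        return " ".join(
            next(values) if token == self.PLACEHOLDER else token
            for token in self.templates[template_id].split()
        )

    def search(self, term):
        """Return every stored message that contains the search term."""
        return [msg for msg in map(self.rebuild, self.records) if term in msg]
```

Storing "user 42 logged in" and "user 7 logged in" keeps a single template, "user <*> logged in", plus one short value per message, which is where the storage saving in this sketch comes from.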
Abstract:
The present invention realizes high-speed retrieval, without requiring additional memory, in a document retrieval system that refers to partial data of documents including structured data, such as XML documents and e-mail messages. The present invention includes storage means for storing documents to be retrieved on a disk device, calculation means for calculating an allocated capacity of the memory, and storage means for saving in the memory, within the calculated allocated capacity, partial data of the documents stored on the disk device. The present invention also includes first retrieval means for searching the partial data stored in the memory, determining means for determining, based on the result of the first retrieval, whether or not to search the documents stored on the disk device, and second retrieval means for searching the documents stored on the disk device in accordance with that determination.
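The two-stage retrieval can be sketched roughly as follows. The abstract does not say how the partial data is chosen or how the determination is made, so this example assumes the partial data is a fixed-length prefix of each document, that the memory budget is split evenly, and that the disk is consulted only when the prefix misses and is known to be incomplete. The names `build_cache` and `search` are hypothetical, and a dict stands in for the disk device.

```python
def build_cache(disk, memory_budget):
    """Keep a prefix of every document in memory, within the allocated capacity.

    Each entry is (prefix, is_complete); is_complete is True when the prefix
    is the whole document, so the disk never needs to be consulted for it.
    """
    per_doc = memory_budget // max(len(disk), 1)
    return {
        doc_id: (text[:per_doc], len(text) <= per_doc)
        for doc_id, text in disk.items()
    }


def search(term, disk, cache):
    """Return (matching doc ids, number of documents read from disk)."""
    hits, disk_reads = [], 0
    for doc_id in sorted(cache):
        prefix, is_complete = cache[doc_id]
        if term in prefix:
            hits.append(doc_id)          # first retrieval answers from memory
        elif not is_complete:
            disk_reads += 1              # determination: the disk must be checked
            if term in disk[doc_id]:     # second retrieval on the stored document
                hits.append(doc_id)
    return hits, disk_reads
```

In this sketch a hit in the in-memory prefix, or a miss on a fully cached document, is settled without any disk access; only misses on truncated documents fall through to the second retrieval.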
Abstract:
A retrieval apparatus 100 for searching document data comprises a document storage area 141 for storing documents to be searched and a document management table 142 that stores the data size of each document in association with a document ID identifying the document. The retrieval apparatus 100 reads the data sizes of the documents to be searched from the document management table 142, calculates a retrieval document size by summing them, and, based on the retrieval document size, calculates an estimated time t1 for a retrieval process using the index scan method and an estimated time t2 for the retrieval process using the text scan method. The retrieval apparatus 100 then compares the estimated times t1 and t2 and decides whether to use the index scan method or the text scan method for the retrieval process.
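The decision step can be sketched as below. The abstract does not give the formulas for t1 and t2, so the linear cost models and every constant here are assumptions chosen only to make the comparison concrete; `DOCUMENT_SIZES` stands in for the document management table, and `choose_method` is a hypothetical name.

```python
# Hypothetical document management table: document ID -> data size in bytes.
DOCUMENT_SIZES = {"d1": 2_000, "d2": 5_000, "d3": 120_000}


def choose_method(doc_ids, index_fixed=0.05, index_per_byte=1e-7, text_per_byte=1e-5):
    """Pick a scan method from estimated times (all cost parameters are assumed).

    t1: index scan, modeled as a fixed overhead plus a small per-byte cost.
    t2: text scan, modeled as a per-byte cost over the documents to be searched.
    """
    retrieval_size = sum(DOCUMENT_SIZES[doc_id] for doc_id in doc_ids)
    t1 = index_fixed + index_per_byte * retrieval_size
    t2 = text_per_byte * retrieval_size
    method = "index scan" if t1 <= t2 else "text scan"
    return method, t1, t2
```

Under these assumed costs, a small target set such as `["d1"]` favors the text scan because the fixed index overhead dominates, while searching all three documents favors the index scan.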