Abstract:
A similar document search method includes: a step of extracting, from a seeds document containing the desired retrieval contents, a characteristic word candidate, that is, a candidate for a characteristic word; a step of extracting, when the extracted candidate is a compound characteristic word made up of a plurality of characteristic words, both the compound characteristic word and its constituent characteristic words as characteristic words of the seeds document; a step of calculating, from the extracted characteristic words, the similarity between the seeds document and a registration document; and a step of outputting the calculated similarity as a retrieval result.
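The steps above can be illustrated with a minimal sketch. It is not the claimed method: it assumes that candidates arrive as strings, that a compound candidate is one whose words are separated by spaces, and that similarity is the cosine of raw word counts, none of which the abstract specifies. The names `expand_candidates`, `cosine`, and `similarity` are hypothetical.

```python
from collections import Counter
from math import sqrt


def expand_candidates(candidates):
    """Keep every candidate; for a compound candidate, also add its constituents.

    Assumption: a compound is any candidate containing more than one
    space-separated word.
    """
    words = []
    for cand in candidates:
        parts = cand.split()
        words.append(cand)
        if len(parts) > 1:
            words.extend(parts)
    return words


def cosine(a, b):
    """Cosine similarity between two Counter word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity(seed_candidates, registered_candidates):
    """Similarity between a seeds document and a registration document.

    Assumption: both documents are already reduced to candidate lists.
    """
    seed = Counter(expand_candidates(seed_candidates))
    registered = Counter(expand_candidates(registered_candidates))
    return cosine(seed, registered)
```

Under these assumptions, a seeds document with the candidate "document retrieval" and a registration document with "document search" share no compound word, yet they still receive a non-zero similarity through the shared constituent "document".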
Abstract:
The present invention realizes high-speed retrieval, without requiring additional memory, in a document retrieval system that refers to partial data of documents, including structured data such as XML documents and e-mails. The invention includes storage means for storing the documents to be retrieved on a disk device, calculation means for calculating the memory capacity to be allocated, and storage means for saving in memory, up to the calculated allocated capacity, partial data of the documents stored on the disk device. The invention also includes first retrieval means for searching the partial data held in memory, determining means for determining, based on the result of the first retrieval, whether to search the documents stored on the disk device, and second retrieval means for searching the documents stored on the disk device according to that determination.
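The two-stage arrangement described above can be sketched as follows. This is only an illustration under stated assumptions: a dictionary stands in for the disk device, the partial data is a single field (here `subject`), the memory budget is passed in rather than calculated, and the determining step is read as "search the disk only for documents whose partial data did not fit in memory". The class `TwoTierIndex` and all of its members are hypothetical.

```python
class TwoTierIndex:
    """Search partial data in memory first, then fall back to disk if needed."""

    def __init__(self, disk_docs, budget_bytes, field="subject"):
        self.disk = disk_docs      # stand-in for the disk device: id -> dict of fields
        self.field = field         # assumption: the partial data is one named field
        self.memory = {}           # partial data held in memory
        self.disk_reads = 0        # counts documents read from the stand-in disk
        used = 0
        for doc_id, doc in disk_docs.items():
            part = doc.get(field, "")
            size = len(part.encode("utf-8"))
            if used + size > budget_bytes:
                continue           # does not fit in the allocated capacity
            self.memory[doc_id] = part
            used += size

    def search(self, term):
        # First retrieval: scan the partial data held in memory.
        hits = [d for d, part in self.memory.items() if term in part]
        # Determination: disk access is needed only for uncached documents.
        uncached = [d for d in self.disk if d not in self.memory]
        # Second retrieval: scan those documents on the stand-in disk.
        for d in uncached:
            self.disk_reads += 1
            if term in self.disk[d].get(self.field, ""):
                hits.append(d)
        return sorted(hits)
```

For example, with three documents and a budget that holds the partial data of only two, a search reads one document from the stand-in disk; with a budget that holds all three, the same search is answered from memory alone.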