Abstract:
The present invention provides a document processing device including: an inputting unit that inputs page image data corresponding to images of pages of a document; an extracting unit that analyzes the page image data input by the inputting unit, specifies the content of each item contained in the document corresponding to that page image data, and extracts item data, the item data being character strings expressing that content; a generating unit that links the item data extracted by the extracting unit and generates name data, the name data being a character string expressing a name to be attached to the document; and a writing unit that associates the name data generated by the generating unit with the page image data input by the inputting unit and writes the name data and the page image data to a memory.
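The generating and writing units described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the abstract does not say how items are linked or how the memory is organized, so the underscore separator, the `StoredDocument` record, the dict used as "memory", and the function names are all assumptions. The extracting unit (page image analysis) is out of scope here, and the item strings are taken as already extracted.

```python
from dataclasses import dataclass


@dataclass
class StoredDocument:
    """Hypothetical record pairing name data with its page image data."""
    name: str
    page_images: list


def generate_name(item_data, separator="_"):
    """Link extracted item strings into one name string.

    The separator is an assumption; the abstract only says the items are linked.
    """
    # Trim whitespace and skip empty items so the name has no doubled separators.
    parts = [item.strip() for item in item_data if item.strip()]
    return separator.join(parts)


def write_document(memory, item_data, page_images):
    """Associate the generated name with the page images and store both.

    `memory` is a plain dict standing in for the device memory (assumption).
    """
    name = generate_name(item_data)
    document = StoredDocument(name=name, page_images=list(page_images))
    memory[name] = document
    return document


# Example usage with made-up item strings and placeholder image bytes.
memory = {}
stored = write_document(
    memory,
    ["Invoice", " 2024-01-15 ", "", "Acme"],
    [b"page-1-image", b"page-2-image"],
)
```

Keying the store by the generated name is a simplification; a real device would need a policy for name collisions, which the abstract does not address.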
Abstract:
The present invention provides an associated-document retrieval apparatus whose retrieval reflects the relations among keywords connected by logical operators in a retrieval expression. In the apparatus, a document information storing element stores each document in association with the keywords extracted from it. A retrieval expression obtaining element receives a retrieval expression containing retrieval keywords that may be connected by logical operators. A number-of-documents calculating element specifies objective keywords from among the extracted keywords stored in the document information storing element and calculates several different kinds of document counts. A degree-of-similarity determining element determines the degree of similarity between the retrieval expression received by the retrieval expression obtaining element and each of the objective keywords, in accordance with a relationship among the document counts calculated by the number-of-documents calculating element. A degree-of-association determining element obtains associated-document information for each document containing any of the objective keywords and determines the degree of association between the retrieval expression and each of those documents, based on the degree of similarity for each objective keyword and the associated-document information.
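The pipeline above can be sketched in a few lines. This is an illustration under stated assumptions, not the method claimed: the abstract does not define which document counts are used, how similarity is derived from them, or how association is aggregated. Here the retrieval expression is a nested tuple `(op, left, right)`, objective keywords are taken to be the keywords occurring in documents that satisfy the expression, similarity is a Jaccard ratio of document counts, and association is the sum of similarities of the objective keywords a document contains. All function names are hypothetical.

```python
def matching_docs(expr, index):
    """Return the set of documents satisfying a retrieval expression.

    `expr` is a keyword string or a tuple (op, left, right), op in {"AND", "OR"}.
    `index` maps a document id to the set of keywords extracted from it.
    """
    if isinstance(expr, str):
        return {doc for doc, keywords in index.items() if expr in keywords}
    op, left, right = expr
    left_docs = matching_docs(left, index)
    right_docs = matching_docs(right, index)
    return left_docs & right_docs if op == "AND" else left_docs | right_docs


def keyword_similarities(expr, index):
    """Similarity between the expression and each objective keyword.

    Assumption: similarity = |docs matching both| / |docs matching either|.
    """
    hits = matching_docs(expr, index)
    # Assumption: objective keywords are those appearing in any matching document.
    objective = {kw for doc in hits for kw in index[doc]}
    similarities = {}
    for kw in objective:
        kw_docs = matching_docs(kw, index)
        similarities[kw] = len(hits & kw_docs) / len(hits | kw_docs)
    return similarities


def document_associations(expr, index):
    """Association score for every document containing an objective keyword.

    Assumption: the score is the sum of similarities of the objective
    keywords present in the document.
    """
    similarities = keyword_similarities(expr, index)
    scores = {}
    for doc, keywords in index.items():
        shared = keywords & similarities.keys()
        if shared:
            scores[doc] = sum(similarities[kw] for kw in shared)
    return scores


# Example usage with a made-up four-document index.
index = {
    "d1": {"printer", "toner"},
    "d2": {"printer", "paper"},
    "d3": {"toner", "paper"},
    "d4": {"scanner"},
}
expr = ("AND", "printer", "toner")
similarities = keyword_similarities(expr, index)
associations = document_associations(expr, index)
```

With this index, only `d1` satisfies the expression, yet `d2` and `d3` still receive a nonzero association score through the keywords they share with it, which is the point of associated retrieval; `d4` shares no objective keyword and is left out.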