Abstract:
A document storage and retrieval system is provided with means for storing a document body in the form of an image, means for storing text information for retrieval in the form of a character code string, means for executing a retrieval with reference to the text information, and means for displaying the related document image on a retrieval terminal according to the retrieval result. Such a system supports retrieval over the full contents of a document and also displays the document body directly as an image, in its easy-to-read printed format. Accordingly, users can retrieve documents with arbitrary words and can read even documents complicated by mathematical expressions and charts on a terminal in image form, just as on paper. Further, the invention provides a system wherein the text information for retrieval is extracted automatically from the document image through character recognition. Because the precision of character recognition has hitherto been unsatisfactory, operators have always had to check the result visually and correct it. According to the invention, however, no such operator attendance is necessary. Thus, the text information for retrieval can be generated at a practical cost in time and money even for large volumes of documents.
Abstract:
A document storage and retrieval system includes means for storing a document body in the form of an image, means for storing text information for retrieval in the form of a character code string, apparatus for executing a retrieval with reference to the text information, and apparatus for displaying the related document image on a retrieval terminal according to the retrieval result. Such a system supports retrieval over the full contents of a document and also displays the document body directly as an image, in its easy-to-read printed format. Users can retrieve documents with arbitrary words and can read even documents complicated by mathematical expressions and charts on a terminal in image form, just as on paper. A system is provided wherein the text information for retrieval is extracted automatically from the document image through character recognition. Because the precision of character recognition has hitherto been unsatisfactory, operators have always had to check the result visually and correct it. With this system, however, no such operator attendance is necessary.
Abstract:
A document storage and retrieval system stores a document body in the form of an image, stores text information for retrieval in the form of a character code string, executes a retrieval with reference to the text information, and then displays the related document image on a retrieval terminal according to the retrieval result. Such a system supports retrieval over the full contents of a document and also displays the document body directly as an image, in its easy-to-read printed format.
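The store-text-for-search, show-image-for-reading arrangement described in the abstracts above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `DocumentStore` class, its method names, the whitespace tokenization, and the requirement that every query word match are not taken from the patents, and the character recognition step that would produce the text is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DocumentStore:
    """Hypothetical store: image bytes for display, text only for retrieval."""
    images: Dict[int, bytes] = field(default_factory=dict)
    index: Dict[str, Set[int]] = field(default_factory=dict)

    def add(self, doc_id: int, image: bytes, text: str) -> None:
        # Keep the document body as an image and index the words of its text.
        self.images[doc_id] = image
        for word in text.lower().split():
            self.index.setdefault(word, set()).add(doc_id)

    def retrieve(self, query: str) -> List[bytes]:
        # Search the text index, then return the matching document images.
        words = query.lower().split()
        if not words:
            return []
        ids = set.intersection(*(self.index.get(w, set()) for w in words))
        return [self.images[i] for i in sorted(ids)]


store = DocumentStore()
store.add(1, b"IMG1", "fourier transform of a signal")
store.add(2, b"IMG2", "signal detection")
matches = store.retrieve("fourier signal")  # images of documents containing both words
```

The point of the design is that the searchable text never has to be shown to the user, so imperfect text can still lead to a faithful page image.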
Abstract:
In a pattern recognition device that recognizes an unknown pattern according to the magnitude of the similarities between the unknown pattern and a plurality of standard patterns, the similarity between the unknown pattern and one of the standard patterns is detected as follows. Similarities are first detected in respective shifting conditions where the unknown and standard patterns are shifted relative to each other over a first limited extent, including the condition without shift. The maximum value of these similarities is then detected. When the shifting condition that gave the maximum value is the one without relative shift, similarities are further detected in respective shifting conditions where the patterns are shifted relative to each other over a second extent larger than the first limited extent.
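The two-stage shift search can be sketched as follows. This is only an illustration under stated assumptions: patterns are equal-sized binary grids, similarity is taken to be the count of pixels set in both patterns, and the extents are small square neighborhoods. None of these choices, nor the function names, come from the abstract. The trigger for the wider search follows the abstract's wording: it runs when the zero shift gives the first-stage maximum.

```python
from typing import Dict, List, Tuple

Pattern = List[List[int]]  # binary image: rows of 0/1 pixels (assumed representation)
Shift = Tuple[int, int]    # (dy, dx)


def shifted_similarity(unknown: Pattern, standard: Pattern, dy: int, dx: int) -> int:
    """Count pixels set in both patterns when `unknown` is shifted by (dy, dx)."""
    rows, cols = len(standard), len(standard[0])
    score = 0
    for y in range(rows):
        for x in range(cols):
            uy, ux = y - dy, x - dx
            if 0 <= uy < rows and 0 <= ux < cols:
                score += unknown[uy][ux] & standard[y][x]
    return score


def shifts_within(extent: int) -> List[Shift]:
    """All shifts in a square neighborhood, including the zero shift."""
    span = range(-extent, extent + 1)
    return [(dy, dx) for dy in span for dx in span]


def two_stage_similarity(unknown: Pattern, standard: Pattern,
                         first_extent: int = 1,
                         second_extent: int = 2) -> Tuple[int, Shift]:
    # Stage 1: similarities over the first, limited extent.
    scores: Dict[Shift, int] = {
        s: shifted_similarity(unknown, standard, *s)
        for s in shifts_within(first_extent)
    }
    best = max(scores.values())
    # Stage 2: widen the search only if the zero shift gave the maximum.
    if scores[(0, 0)] == best:
        for s in shifts_within(second_extent):
            if s not in scores:
                scores[s] = shifted_similarity(unknown, standard, *s)
    best_shift = max(scores, key=lambda s: scores[s])
    return scores[best_shift], best_shift
```

A recognizer would call `two_stage_similarity` once per standard pattern and pick the standard pattern with the largest returned score.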
Abstract:
An image understanding system of this invention uses a grammar describing a document image, and represents the structure of an unknown input image by parsing a statement (the structure of the grammar) written in accordance with this grammar. In other words, the grammar describes an image as substructures and the relative relations between them; when the substructures and their relative relations are identified in parsing, a search is made to determine whether they exist in the unknown input image. The structure of the unknown input image is represented on the basis of the result of this search.
Abstract:
For halftone digital image data, an edge portion of the image is extracted for each pixel, and background pixels, other than the edge portion of the image, are extracted based on the density of each pixel or the average density of each group of adjacent or neighboring pixels. Thereafter, the image data is subdivided into a plurality of blocks, each including a predetermined number of pixels. Based on the distribution of the pixels in a block judged to belong to the edge and of those judged to belong to the background, domain recognition is conducted to determine whether the block is a binary block or a halftone block. In addition, for each block, the state of the areas in the surrounding blocks is examined, and depending on that state, expansion/contraction processing is repeated a predetermined number of times for the halftone domain, thereby separating the image data into a binary domain and a halftone domain in real time.
Abstract:
A document analysis system for determining format information of a document, wherein frames and a relationship of the frames are extracted from an image of an unmarked sample document, characters in a frame of the document are recognized, and an image structure is analyzed based on the frame and the recognized characters.
Abstract:
An information retrieval system uses human-interface methods for ease of use and has two distinctive features: a visual interface and natural language interpretation. The visual interface provides visual interaction for local search, and natural language interpretation provides linguistic interaction for global search. The visual interface provides versatile views onto the contents of the system's knowledge base, controlling mechanisms for browsing through the knowledge base, a capability of showing relevant information to the users, and a mechanism for editing a query expression that describes the information to retrieve. By using the visual interface for information retrieval, users can easily create query expressions by consulting and interacting with the system. The natural language interpretation makes use of a conceptual network as a knowledge base that stores important concepts and the relationships among them. Based on the knowledge and information represented in the conceptual network, the meaning of a noun phrase or a nominal compound, which is a string of adjectives and nouns with some prepositions, can be inferred. The inferred interpretation of such a noun phrase is paraphrased into an expression that the information retrieval system can handle. Therefore, the user can simply describe the desired information in natural language to obtain it.
Abstract:
A character stream search system uses an FSA (finite-state automaton) to determine simultaneously whether or not a plurality of character streams that are search objects exist in a search character stream, which undergoes the search operation and comprises a plurality of characters expressed as codes. In the system, a collation is conducted between the search character stream and a search object character. Where a matched search object character exists as a result of the collation, a predetermined state transition indicated by the FSA is carried out. Where no matched search object character exists, failure processing is performed to effect a state transition to a transition destination determined in association with the configuration of the FSA. This processing is completed within a count bounded by a predetermined upper-limit value for each character that undergoes the search operation.
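Searching for several strings in one pass with match transitions and failure transitions can be illustrated with the textbook Aho-Corasick automaton, sketched below. This is a swapped-in, well-known technique and is not claimed to be the patented construction: in particular, the classic algorithm bounds failure steps only in the amortized sense, whereas the abstract describes a fixed upper limit per character. The function names and the sample keywords are assumptions for illustration.

```python
from collections import deque
from typing import Dict, List, Tuple


def build_automaton(keywords: List[str]):
    """Build goto, failure, and output tables (classic Aho-Corasick)."""
    goto: List[Dict[str, int]] = [{}]
    output: List[List[str]] = [[]]
    for word in keywords:
        if not word:
            continue  # ignore empty search strings
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(word)

    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure destination.
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def search(text: str, keywords: List[str]) -> List[Tuple[int, str]]:
    """Return (start_index, keyword) for every occurrence, in scan order."""
    goto, fail, output = build_automaton(keywords)
    state = 0
    hits: List[Tuple[int, str]] = []
    for i, ch in enumerate(text):
        # Failure processing: fall back until a transition on ch exists.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in output[state]:
            hits.append((i - len(word) + 1, word))
    return hits


hits = search("ushers", ["he", "she", "his", "hers"])
# hits == [(1, "she"), (2, "he"), (2, "hers")]
```

Each input character is consumed exactly once, so the text is scanned in a single pass regardless of how many search strings are registered.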