摘要:
A method for assisting in the creation of a logical structure model, which stores, from an image in which character strings associated respectively with a plurality of logical elements constituting a logical structure are described, the logical elements, character strings associated with the logical elements, and the logical structure, wherein character strings in an input image and the logical structure among the character strings in the input image are extracted, a logical element is selected among the plurality of logical elements according to the degrees of similarity between the extracted character strings and the character string associated respectively with the plurality of logical elements stored in the logical structure model, a character string associated with the selected logical element and a character string in the input image associated with the logical element based on the logical structure among the extracted character strings in the input image are extracted.
摘要:
An image recognition method is conducted by recognizing logical elements based on a logical structure model set to correspond to the logical structure of an image of individual character strings, collecting information processed with the logical structure model of images of a logical structure, acquiring a recognition result when recognizing an image of a logical structure by processing information collected with a post-update logical structure model,and outputting warning information about the post-update logical structure model to an output unit when a result of the comparison is a non-match.
摘要:
An image recognition method is conducted by recognizing logical elements based on a logical structure model set to correspond to the logical structure of an image of individual character strings, collecting information processed with the logical structure model of images of a logical structure, acquiring a recognition result when recognizing an image of a logical structure by processing information collected with a post-update logical structure model,and outputting warning information about the post-update logical structure model to an output unit when a result of the comparison is a non-match.
摘要:
A character area extracting device includes a reflective and non-reflective area separation unit separating image data into reflective and non-reflective areas, and binarizing the image data by changing a first threshold value when it is inappropriate; a reflective area binarizing unit separating the reflective area into character and background areas, and binarizing it by changing a second threshold value when it is inappropriate; a non-reflective area binarizing unit separating the non-reflective area into the character and background areas, and binarizing it by changing a third threshold value when it is inappropriate; a reflective and non-reflective area separation evaluation unit; and a line extracting unit connecting the character areas of the reflective and non-reflective areas and extracting positional information of the connected character areas in the image data.
摘要:
A character area extracting device includes a reflective and non-reflective area separation unit separating image data into reflective and non-reflective areas, and binarizing the image data by changing a first threshold value when it is inappropriate; a reflective area binarizing unit separating the reflective area into character and background areas, and binarizing it by changing a second threshold value when it is inappropriate; a non-reflective area binarizing unit separating the non-reflective area into the character and background areas, and binarizing it by changing a third threshold value when it is inappropriate; a reflective and non-reflective area separation evaluation unit; and a line extracting unit connecting the character areas of the reflective and non-reflective areas and extracting positional information of the connected character areas in the image data.
摘要:
A document recognizing apparatus includes a display control unit which displays a document data including a character string related to a character string selected by a user, and an area that includes at least a character string of the document data.
摘要:
A logical structure analyzing apparatus includes an extracting unit that extracts word candidates from a form, a first generating unit that classifies each of the word candidates into a group of heading candidates or a group of data candidates to generate, based on positions of the word candidates on the form, first candidate sets each including one heading candidate and one data candidate identifiable by the heading candidate, and a second generating unit that combines the first candidate sets to generate second candidate sets that each include plural heading candidates that differ and one data candidate. The apparatus also includes a removing unit that, based on positions of the heading candidates and the data word candidate in each second candidate set, removes from among the second candidate sets, a determined set including a data item and headings identifying the data item, and an output unit that outputs the determined set.
摘要:
An image recognition apparatus recognizes the correspondence between character strings and logical elements composing a logical structure in an image in which the character strings are described as the logical elements to recognize each logical element. The image recognition apparatus includes outputting means for outputting the recognized logical elements when the correspondence is recognized or re-recognized; first determining means for determining a certain logical element to be correct when input of a determination request to determine the logical element is received from a user; second determining means for determining the correctness of all the logical elements output before the logical element determined by the first determining means and is positioned according to confirmation by the user; and re-recognizing means for re-recognizing the correspondence between logical elements that have not been determined to be correct and the character strings on the basis of the determination content for each logical element.
摘要:
The form data extracting apparatus, even input form does not have a logical structure stored in the generic logical structure DB, by using logical elements in the existing logical structure and a registered form obtained on the basis of (a) the logical structure, (b) pieces of position information of the logical elements, and (c) a relation between the logical elements. A logical element and a logical structure are extracted from the input form, and the extracted logical structure can be defined as a new registered form or a new logical structure.
摘要:
A method for assisting in the creation of a logical structure model, which stores, from an image in which character strings associated respectively with a plurality of logical elements constituting a logical structure are described, the logical elements, character strings associated with the logical elements, and the logical structure, wherein character strings in an input image and the logical structure among the character strings in the input image are extracted, a logical element is selected among the plurality of logical elements according to the degrees of similarity between the extracted character strings and the character string associated respectively with the plurality of logical elements stored in the logical structure model, a character string associated with the selected logical element and a character string in the input image associated with the logical element based on the logical structure among the extracted character strings in the input image are extracted.