Abstract:
A method and apparatus for editing a scanned image. The present invention provides for editing of a scanned image in terms of interpretations of graphical objects contained therein. A graphical object can represent a letter, word, line of text, graphic or any other portion of the document image selected by the user. An interpretation embodies a predetermined relationship between graphical objects as well as editing operations that can be performed on the graphical objects. Interpretations belong to one of two classes. A first class, set interpretation, treats graphical objects as an unordered set lying within a document plane. Editing operations in a set interpretation allow a graphical object to be manipulated within the document plane without disturbing the spatial orientation of other graphical objects. A second class, sequence interpretation, is like a set interpretation except that the set of graphical objects is ordered. An editing operation in a sequence interpretation will typically affect the spatial orientation of other graphical objects in the set of graphical objects. A particular type of sequence interpretation, called text interpretation, allows for manipulation of sets of graphical objects as if they were text.
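The difference between the two classes of interpretation can be illustrated with a deletion. The following is a minimal sketch, not the patented implementation: the `Box` type and both function names are hypothetical, objects are assumed to lie on a single horizontal line, and the sequence case is assumed to close the gap by shifting later objects left.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Box:
    """Hypothetical graphical object: a bounding box in the document plane."""
    x: int
    y: int
    w: int
    h: int


def delete_in_set(objects: List[Box], index: int) -> List[Box]:
    """Set interpretation: remove one object; all others keep their positions."""
    return [b for i, b in enumerate(objects) if i != index]


def delete_in_sequence(objects: List[Box], index: int) -> List[Box]:
    """Sequence interpretation: remove one object and shift later ones left.

    Assumption: the shift equals the distance between the deleted object's
    left edge and the left edge of the object that follows it.
    """
    if index + 1 < len(objects):
        shift = objects[index + 1].x - objects[index].x
    else:
        shift = 0  # deleting the last object moves nothing
    result = []
    for i, b in enumerate(objects):
        if i < index:
            result.append(b)
        elif i > index:
            result.append(replace(b, x=b.x - shift))
    return result


objects = [Box(0, 0, 10, 10), Box(12, 0, 10, 10), Box(24, 0, 10, 10)]
after_set = delete_in_set(objects, 1)
after_sequence = delete_in_sequence(objects, 1)
```

In the set case the third object stays at `x=24`, leaving a hole; in the sequence case it moves to `x=12`, as text would reflow after a deleted word.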
Abstract:
A method and apparatus for applying morphological image criteria that identify image units in an undecoded document image having significant information content, and for retrieving related data that supplements the document either from elsewhere within the document or a source external to the document. The retrieved data can result from character code recognition or template matching of the identified significant image units, or the retrieved data can result directly from an analysis of the morphological image characteristics of the identified significant image units. A reading machine can allow a user to browse and select documents or segments thereof, and to obtain interactive retrieval of documents and supplemental data.
Abstract:
A method and apparatus for excerpting and summarizing an undecoded document image, without first converting the document image to optical character codes such as ASCII text, identify significant words, phrases and graphics in the document image using automatic or interactive morphological image recognition techniques. Document summaries or indices are produced based on the identified significant portions of the document image. The disclosed method is particularly suited to improving reading machines for the blind.
Abstract:
Character level text editing is performed on an image without recognizing characters, by operating on a character-size array obtained from a two-dimensional array defining an image region. A processor, in response to a request for a text editing operation, accesses an edit data structure that includes the image region array and performs the operation. The character-size array is obtained by dividing the image region array when necessary. An image region array that includes more than one line is divided along interline spaces. An image region array that includes one line is divided along intercharacter spaces. Character-size arrays are divided out of larger arrays by finding connected component bounding boxes, and then determining from the bounding boxes whether the connected components are likely to form a character. If so, the connected components are used to obtain the character-size array and spatial data about position, size, and shape of the character. Smaller arrays and spatial data can replace a larger array in the edit data structure. Smaller arrays are obtained only as necessary to perform a requested text editing operation, and if the edit data structure is not otherwise modified, obtaining a smaller array does not necessitate redrawing of the display. In addition to character level editing, a text editing operation can be performed on a sequence of arrays, such as a word, line, or a sequence that begins on one line and ends on another. The spatial data can be used to position arrays after insertion or deletion, to advance a cursor through the text, and to justify a line of arrays. A character-size array can be assigned to a keyboard key, and the key may then be used to insert that array into the text or to request a search for other arrays matching that array.
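The division of an image region array along interline and intercharacter spaces can be sketched as follows. This is an illustrative simplification, not the method claimed: it assumes a binary image stored as a list of rows (1 for ink, 0 for background) and finds the spaces with row and column projection profiles. The connected-component bounding-box step described above is not shown, and all function names are hypothetical.

```python
def split_on_blank(profile):
    """Return (start, end) index pairs for each run of nonzero profile entries."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            runs.append((start, i))        # a blank entry ends the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def split_lines(image):
    """Divide a multi-line region along blank rows (interline spaces)."""
    row_profile = [sum(row) for row in image]
    return [image[a:b] for a, b in split_on_blank(row_profile)]


def split_characters(line):
    """Divide a single-line region along blank columns (intercharacter spaces)."""
    col_profile = [sum(col) for col in zip(*line)]
    return [[row[a:b] for row in line] for a, b in split_on_blank(col_profile)]


image = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
]
lines = split_lines(image)
first_line_chars = split_characters(lines[0])
second_line_chars = split_characters(lines[1])
```

Because the smaller arrays are computed on request, a caller could invoke `split_characters` only for the line touched by an editing operation, in keeping with the abstract's point that smaller arrays are obtained only as necessary.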
Abstract:
A method and apparatus for processing a document image, using a programmed general or special purpose computer, includes forming the image into image units and determining at least one image unit classifier of at least one of the image units, without decoding the content of the at least one of the image units. The classifier of the at least one of the image units is then compared with a classifier of another image unit. The classifier may be image unit length, width, location in the document, font, typeface, cross-section, the number of ascenders, the number of descenders, the average pixel density, the length of the top line contour, the length of the base contour, the location of image units with respect to neighboring image units, vertical position, horizontal inter-image unit spacing, and so forth. The classifier comparison can be a comparison with classifiers of image units of words in a reference table, or with classifiers of other image units in the document. Equivalence classes of image units can be generated, from which word frequency and significance can be determined. The image units can be determined by creating bounding boxes about identifiable segments or extractable units of the image, and can contain a word, a phrase, a letter, a number, a character, a glyph or the like.
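Grouping image units by comparing classifiers, without decoding their content, might look like the sketch below. It rests on several assumptions that are not in the abstract: each image unit is reduced to a hypothetical feature tuple of (width in pixels, ascender count, descender count), two units match when their widths differ by at most `width_tol` and the counts are equal, and each new unit is compared only with the first member of each existing class.

```python
from typing import List, Tuple

# Hypothetical classifier tuple: (width in pixels, ascenders, descenders)
Features = Tuple[int, int, int]


def similar(a: Features, b: Features, width_tol: int = 2) -> bool:
    """Compare two classifier tuples without decoding the underlying word."""
    return abs(a[0] - b[0]) <= width_tol and a[1] == b[1] and a[2] == b[2]


def group_units(units: List[Features], width_tol: int = 2) -> List[List[int]]:
    """Greedily group image unit indices whose classifiers match.

    Each unit is compared with the first member of every existing class,
    so the result approximates equivalence classes under the tolerance.
    """
    classes: List[List[int]] = []
    for i, features in enumerate(units):
        for cls in classes:
            if similar(units[cls[0]], features, width_tol):
                cls.append(i)
                break
        else:
            classes.append([i])  # no match found: start a new class
    return classes


units = [(40, 2, 1), (41, 2, 1), (20, 0, 0), (40, 2, 1), (20, 1, 0)]
classes = group_units(units)
frequencies = [len(cls) for cls in classes]
```

The size of each class serves as a frequency count for the word shape it represents, which is the quantity the abstract uses to judge significance.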
Abstract:
An existing character, in a text defined in image form by data such as a two-dimensional array, is copied to add a new character to the text. The existing character is found by performing character recognition on a two-dimensional data array defining an image that includes part of the text, such as a page. The array can be obtained from a scanner. A word that is recognized as including characters of the type needed is tested to determine whether it can be divided into the correct number of characters. The word is divided by finding connected components in the part of the array in which the word was found during recognition. The connected components are grouped into sets, each set being likely to be a character. If the word can be correctly divided, character-size arrays for its characters are obtained and saved. One of the arrays for the character type of the new character is selected and used to produce an array for the word in which it is included. The new word's array is then used to produce an array for a line in which the new word replaces an old word. The characters of the new word are spaced according to the spacing of the characters of the old word. The new character is positioned transverse to the line based on the transverse positioning of the existing character. The interword spaces of the line are adjusted. The line's array is then used to produce data defining a modified version of the text in image form.
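Spacing the characters of the new word according to the spacing of the old word can be sketched as below. The abstract does not say how the spacing is derived, so this example assumes the mean gap between adjacent character bounding boxes of the old word is reused. Both function names and the `(x, width)` representation are hypothetical.

```python
def mean_gap(boxes):
    """Average horizontal gap between adjacent (x, width) character boxes."""
    gaps = [
        boxes[i + 1][0] - (boxes[i][0] + boxes[i][1])
        for i in range(len(boxes) - 1)
    ]
    return sum(gaps) / len(gaps) if gaps else 0


def place_characters(start_x, widths, gap):
    """Return the left edge of each new character, laid out left to right."""
    positions = []
    x = start_x
    for width in widths:
        positions.append(x)
        x += width + gap  # advance past this character plus one gap
    return positions


old_word = [(0, 5), (7, 5), (14, 5)]   # (x, width) of the old word's characters
gap = mean_gap(old_word)
new_positions = place_characters(0, [5, 6, 5, 4], gap)
```

A fuller version would also carry over the transverse (vertical) offset of the copied character and then redistribute the interword spaces of the line, as the abstract describes.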