Abstract:
A method for processing a plurality of input images containing variable content that is filled into respective, fixed templates. The method includes comparing the images to collect a group of the images having a high degree of similarity therebetween, and combining the images in the group so as to distinguish the variable content from a fixed portion common to a preponderant number of the images in the group. The fixed portion is processed to reconstruct the fixed template that is common to at least some of the images among the preponderant number, and information is extracted from the images using the reconstructed template.
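The combining step can be illustrated with a minimal sketch. It assumes the images in the group are already aligned, of equal size, and binarized (1 = ink, 0 = background). The function names and the `min_fraction` threshold are hypothetical, chosen for illustration only; the abstract does not specify how "preponderant" is measured.

```python
def reconstruct_template(images, min_fraction=0.6):
    """Estimate the fixed template by per-pixel voting.

    Assumption: a pixel belongs to the template when it is set in at
    least `min_fraction` of the images in the group.
    """
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    needed = min_fraction * len(images)
    template = []
    for y in range(height):
        row = []
        for x in range(width):
            votes = sum(img[y][x] for img in images)
            row.append(1 if votes >= needed else 0)
        template.append(row)
    return template


def extract_variable_content(image, template):
    """Keep only the ink that is not explained by the template."""
    return [
        [1 if px and not t else 0 for px, t in zip(img_row, t_row)]
        for img_row, t_row in zip(image, template)
    ]
```

Subtracting the voted template from each image leaves the filled-in content, which is the part a downstream recognizer would read.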
Abstract:
A method for automated coding of a text phrase relative to a catalog of codes. The method includes finding a plurality of the codes that are candidates for coding of the phrase and identifying a category to which one or more of the candidate codes belong. The phrase is conveyed together with the one or more candidate codes in the identified category to a human operator specialized in the identified category, for verification by the operator of one of the candidate codes in the category for assignment to the phrase.
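The routing step might look like the following sketch. The abstract does not say how the category is chosen when candidates span several categories; picking the category with the most candidates is an assumption made here. The function name, the `(code, category)` pair layout, and the operator table are likewise hypothetical.

```python
from collections import Counter


def route_phrase(phrase, candidates, operators):
    """Pick a category for the phrase and build a verification task.

    candidates: list of (code, category) pairs from the automatic coder.
    operators:  dict mapping category -> operator identifier.
    Assumption: the category holding the most candidates is selected.
    """
    if not candidates:
        raise ValueError("no candidate codes for phrase")
    counts = Counter(category for _, category in candidates)
    category = counts.most_common(1)[0][0]
    codes = [code for code, cat in candidates if cat == category]
    return {
        "phrase": phrase,
        "category": category,
        "codes": codes,
        "operator": operators[category],
    }
```

The operator receives only the candidate codes in their own specialty, so verification reduces to confirming one code from a short list.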
Abstract:
A method for automatic sorting includes receiving an item in a sequence of items to be sorted, each such item marked with a respective machine-readable identifying code and with respective characters in a location relative to the code that varies from one item to another in the sequence. A position of the code on the item is determined and, responsive to the position of the code, the location of the characters on the item is found. The characters are processed to determine a destination of the item.
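As a rough illustration of finding the characters from the code position, the sketch below assumes the characters sit at a known offset from the code's top-left corner, so that only the code's position on the item changes. The box layouts, the offset tuple, and the function name are hypothetical.

```python
def find_character_region(code_box, offset, item_size):
    """Return (left, top, right, bottom) of the character region, or None.

    code_box:  (x, y, width, height) of the located identifying code.
    offset:    (dx, dy, width, height) of the character region, measured
               from the code's top-left corner (assumed known in advance).
    item_size: (width, height) of the item image, used for clipping.
    """
    cx, cy, _cw, _ch = code_box
    dx, dy, rw, rh = offset
    width, height = item_size
    left = max(0, cx + dx)
    top = max(0, cy + dy)
    right = min(width, cx + dx + rw)
    bottom = min(height, cy + dy + rh)
    if right <= left or bottom <= top:
        return None  # region falls entirely outside the item
    return (left, top, right, bottom)
```

The returned region would then be passed to a character recognizer to read the destination.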
Abstract:
A method for compressing a digitized image of a document using optical character recognition (OCR). The method includes performing OCR on the digitized image; identifying, based at least in part on a result of the OCR, a plurality of classes of characters in the image, each class having an associated character value and comprising at least one character; pruning each class of characters, thereby producing information describing the plurality of classes of characters and a residual image; and using that information together with the residual image as a compressed digitized image in further processing. Related methods and apparatus are also disclosed.
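One possible reading of the class-and-residual scheme is sketched below. The abstract does not define "pruning"; the sketch assumes it means dropping from a class any glyph that differs too much from the class prototype and leaving that glyph in the residual. The prototype choice (first glyph seen), the Hamming-distance test, and all names are assumptions for illustration.

```python
def hamming(a, b):
    """Count differing pixels between two equal-size binary glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def compress(ocr_results, max_distance=1):
    """Split OCR output into character classes and a residual.

    ocr_results: list of (value, (x, y), glyph), where glyph is a tuple
                 of pixel rows (0/1) and all glyphs share one size.
    Assumption: the first glyph seen for a value is the class prototype;
    glyphs farther than `max_distance` from it are pruned to the residual.
    """
    classes = {}   # value -> {"prototype": glyph, "positions": [...]}
    residual = []  # (position, glyph) pairs stored verbatim
    for value, position, glyph in ocr_results:
        cls = classes.setdefault(value, {"prototype": glyph, "positions": []})
        if hamming(cls["prototype"], glyph) <= max_distance:
            cls["positions"].append(position)
        else:
            residual.append((position, glyph))
    return classes, residual
```

Under this reading, each class stores one bitmap plus a list of positions, and only the glyphs that fit no prototype are kept pixel for pixel.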
Abstract:
A method for locating a structured field in a gray-scale image of an object, including choosing a plurality of anchor points in the image, each anchor point having a gray-scale value associated therewith. For each anchor point there is determined a horizontal variation dependent on a difference between the gray-scale value of the anchor point and the gray-scale value of a horizontally neighboring anchor point, and there is also determined a vertical variation dependent on a difference between the gray-scale value of the anchor point and the gray-scale value of a vertically neighboring anchor point. Those anchor points whose vertical and horizontal variations obey a first or a second predefined condition are defined as vertically or horizontally dominant, respectively. One or more kernels are defined in the image, each such kernel comprising a group of anchor points in predetermined mutual proximity and satisfying a third predefined condition relating the number of vertically-dominant and horizontally-dominant anchor points in the group. The structured field in the image is located using the one or more kernels.
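The dominance test and kernel test can be sketched as follows. The abstract leaves all three predefined conditions unspecified, so the ones used here are assumptions: a point is vertically dominant when its vertical variation is at least `min_variation` and at least `ratio` times its horizontal variation (and symmetrically for horizontal dominance), and a square window counts as a kernel when it holds at least `min_each` points of each kind. All names and thresholds are hypothetical.

```python
def classify_anchor_points(gray, min_variation=20, ratio=2.0):
    """Label each anchor point "V", "H", or None.

    gray: 2-D list of gray-scale values sampled at the anchor points.
    Points in the last row or column have no neighbor and stay None.
    """
    rows, cols = len(gray), len(gray[0])
    labels = [[None] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            horizontal = abs(gray[r][c] - gray[r][c + 1])
            vertical = abs(gray[r][c] - gray[r + 1][c])
            if vertical >= min_variation and vertical >= ratio * horizontal:
                labels[r][c] = "V"
            elif horizontal >= min_variation and horizontal >= ratio * vertical:
                labels[r][c] = "H"
    return labels


def is_kernel(labels, top, left, size=2, min_each=1):
    """Check the assumed kernel condition on a size x size window."""
    window = [
        labels[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    ]
    return window.count("V") >= min_each and window.count("H") >= min_each
```

Windows that pass `is_kernel` mark areas where both kinds of variation occur close together, which is the cue this sketch uses for a structured field.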