Abstract:
This invention describes a post-recognition procedure to group text recognized by an Optical Character Reader (OCR) from a document image into zones. Once the recognized text and the corresponding word bounding boxes for each word of the text are received, the procedure described dilates (expands) these word bounding boxes by a factor and records those which cross. Two word bounding boxes will cross upon dilation if the corresponding words are very close to each other on the original document. The text is then grouped into zones using the rule that two words will belong to the same zone if their word bounding boxes cross upon dilation. The text zones thus identified are sorted and returned.
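The grouping rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the box layout `(left, top, right, bottom)`, the default factor of 1.5, dilation about the box centre, the transitive merging of crossing boxes via union-find, and the top-to-bottom, left-to-right sort order are all assumptions, and the names `dilate`, `boxes_cross`, and `group_into_zones` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed layout: (left, top, right, bottom) in image coordinates.
Box = Tuple[float, float, float, float]


def dilate(box: Box, factor: float) -> Box:
    """Expand a box about its centre so width and height scale by `factor`."""
    left, top, right, bottom = box
    dx = (right - left) * (factor - 1) / 2
    dy = (bottom - top) * (factor - 1) / 2
    return (left - dx, top - dy, right + dx, bottom + dy)


def boxes_cross(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def group_into_zones(words: List[str], boxes: List[Box],
                     factor: float = 1.5) -> List[List[str]]:
    """Group words whose dilated boxes cross, directly or through a chain."""
    n = len(words)
    parent = list(range(n))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    dilated = [dilate(b, factor) for b in boxes]
    # Record every crossing pair by merging the two words' sets.
    for i in range(n):
        for j in range(i + 1, n):
            if boxes_cross(dilated[i], dilated[j]):
                parent[find(i)] = find(j)

    zones: Dict[int, List[int]] = {}
    for i in range(n):
        zones.setdefault(find(i), []).append(i)

    # Assumed sort: words by (top, left) within a zone,
    # zones by their topmost then leftmost original edge.
    grouped = []
    for members in zones.values():
        members.sort(key=lambda i: (boxes[i][1], boxes[i][0]))
        grouped.append(members)
    grouped.sort(key=lambda m: (min(boxes[i][1] for i in m),
                                min(boxes[i][0] for i in m)))
    return [[words[i] for i in m] for m in grouped]
```

The pairwise comparison is quadratic in the number of words, which is acceptable for a single page; a spatial index could replace it for larger inputs.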
Abstract:
The invention relates to a slot assembly for a scanner for feeding multiple-sized sheets. The scanner has an input tray for holding a plurality of sheets, a feed path which receives sheets fed from the input tray, and a sheet feeder located in the feed path for feeding sheets. The slot assembly has a first slot for accepting sheets having a first size, the first slot being complementarily sized with respect to the first-sized sheets such that the first-sized sheets are fed by the sheet feeder substantially without being skewed. The slot assembly also has a second slot for accepting sheets having a second size, the second slot being complementarily sized with respect to the second-sized sheets such that the second-sized sheets are fed by the sheet feeder substantially without being skewed.
Abstract:
The boundaries of a scanned digital document are determined by one of two processes. The first identifies the largest connected component in the received digital document and assigns the boundaries of that component as the boundaries of the received digital document. The second uses a row-by-row and column-by-column analysis of the received digital document to identify horizontal and vertical bands in the digital image whose pixels have a value opposite to that of the background pixels, and assigns those horizontal and vertical bands to be the boundaries of the received digital document. These processes may be performed in series or in parallel by a processor associated with a scanner that creates the digital document.
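Both boundary-finding processes can be sketched on a small binary image. This is an illustrative reading rather than the claimed method: the image is assumed to be a list of rows of 0/1 pixels with background value 0, connectivity is assumed to be 4-neighbour, the row and column analysis is interpreted as taking the outermost rows and columns that contain any non-background pixel, and the function names are hypothetical. Bounds are returned as `(top, left, bottom, right)`, inclusive.

```python
from collections import deque
from typing import List, Optional, Tuple

Bounds = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def bounds_by_projection(image: List[List[int]],
                         background: int = 0) -> Optional[Bounds]:
    """Scan rows and columns for any pixel that differs from the background."""
    if not image or not image[0]:
        return None
    rows = [r for r, row in enumerate(image)
            if any(p != background for p in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] != background for row in image)]
    if not rows:
        return None  # the image is entirely background
    return (rows[0], cols[0], rows[-1], cols[-1])


def bounds_by_largest_component(image: List[List[int]],
                                background: int = 0) -> Optional[Bounds]:
    """Return the bounding box of the largest 4-connected foreground region."""
    if not image or not image[0]:
        return None
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    best, best_size = None, 0
    for r in range(height):
        for c in range(width):
            if image[r][c] == background or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            size, top, bottom, left, right = 0, r, r, c, c
            while queue:
                y, x = queue.popleft()
                size += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and image[ny][nx] != background):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size > best_size:
                best, best_size = (top, left, bottom, right), size
    return best
```

The two results differ when the scan contains isolated specks: the projection bounds stretch to include them, while the largest-component bounds ignore them.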
Abstract:
A document input device having a gravity feed paper tray is adapted such that the separator pad is mounted on the base, as opposed to the cover, so that when the cover is opened to access the paper path (as during clearing of a paper jam), paper in the paper tray is held in place by the force of the separator pad against the pick roller, even though the cover is open. The separator pad may be attached to a rod that is attached to the base so as to permit the rod and separator pad to remain in place when the cover is in the open position. A leaf spring may be placed between the rod and the separator pad for biasing the separator pad against the pick roller. Also, the cover may include a spring for biasing the separator pad against the pick roller when the cover is closed.