Abstract:
A system, method, and apparatus for mark recognition in an image of an original document are provided. The method takes as input an image of an original document that provides at least one designated field for accepting a mark applied by a user; the field may or may not have been marked. A region of interest (RoI), roughly corresponding to the designated field, is extracted from the image. A center of gravity (CoG) of the RoI is determined from the distribution of black pixels in the RoI. Thereafter, for one or more iterations, the RoI is partitioned into sub-RoIs based on the determined CoG; at each subsequent iteration, the sub-RoIs generated at the prior iteration serve as the RoIs to be partitioned. Data is extracted from the RoI and sub-RoIs at one or more of the iterations, from which a representation of the entire RoI is generated. This representation is useful in classifying the designated field, e.g., as positive (marked) or negative (not marked).
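The iterative partitioning described above can be sketched as follows. This is only an illustrative sketch under several assumptions not stated in the abstract: the RoI is a binary grid (1 = black pixel), each RoI is split into four quadrants at its CoG, the data extracted from each RoI is simply its black pixel count, and an empty RoI falls back to its geometric center. The names `center_of_gravity`, `black_count`, and `describe_roi` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # binary grid: 1 = black pixel, 0 = white
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), half-open


def black_count(img: Image, box: Box) -> int:
    """Count the black pixels inside the box."""
    top, left, bottom, right = box
    return sum(img[r][c] for r in range(top, bottom) for c in range(left, right))


def center_of_gravity(img: Image, box: Box) -> Tuple[int, int]:
    """Return the (row, col) CoG of black pixels, floored to integers.

    Assumption: a box with no black pixels uses its geometric center.
    """
    top, left, bottom, right = box
    total = row_sum = col_sum = 0
    for r in range(top, bottom):
        for c in range(left, right):
            if img[r][c]:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return (top + bottom) // 2, (left + right) // 2
    return row_sum // total, col_sum // total


def describe_roi(img: Image, iterations: int) -> List[int]:
    """Build a feature vector from the RoI and its sub-RoIs.

    Level 0 is the whole RoI; each further level splits every box of the
    previous level into four quadrants at that box's CoG.
    """
    boxes: List[Box] = [(0, 0, len(img), len(img[0]))]
    features: List[int] = []
    for _ in range(iterations + 1):
        next_boxes: List[Box] = []
        for box in boxes:
            features.append(black_count(img, box))
            top, left, bottom, right = box
            cr, cc = center_of_gravity(img, box)
            next_boxes.extend([
                (top, left, cr, cc), (top, cc, cr, right),
                (cr, left, bottom, cc), (cr, cc, bottom, right),
            ])
        boxes = next_boxes
    return features
```

With `iterations` levels of partitioning the vector has 1 + 4 + ... + 4^iterations entries, and it could then be passed to any classifier to label the field as marked or not marked. The choice of classifier is outside the scope of this sketch.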
Abstract:
A method for separating and categorizing documents includes receiving a scanned batch of documents. The batch includes scanned documents to which document separator stamps have been applied before scanning. Each stamp includes machine recognizable patterns applied on a same page of a document, spaced by a designated field for receiving a user-applied category code. The scanned batch of documents is processed to identify pages that contain a document separator, including identifying at least one of two spaced patterns. For a document page for which a document separator is identified, the corresponding designated field is located and the category code associated with the designated field is identified. The document containing the identified separator is separated from other documents in the batch based on the identified separator, and a document category is assigned to the document based on the identified category code.
Abstract:
A method, apparatus, and hardcopy document are provided. The method provides for separating and categorizing documents and includes receiving a scanned batch of documents. The batch includes a plurality of scanned documents to which document separator stamps have been applied before scanning. Each document separator stamp includes first and second machine recognizable patterns applied on a same page of a document, the first and second patterns being spaced by a designated field for receiving a user-applied category code. The scanned batch of documents is processed to identify pages that contain a document separator, the processing including identifying at least one of the first and second spaced patterns. For each of a plurality of document pages for which a document separator is identified, the method includes locating the corresponding designated field and identifying the category code associated with the designated field. The document containing the identified separator is separated from other documents in the batch based on at least the identified separator and a document category is assigned to the document from a set of document categories, based on the identified category code.
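The separation and categorization step can be illustrated with a minimal sketch. It assumes that pattern detection and category code recognition have already run, so each page carries a `has_separator` flag and an optional `category_code`; it also assumes the stamp appears on the first page of each document. The `Page` and `Document` types, the `split_batch` function, and the example category names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Page:
    number: int
    has_separator: bool = False          # a separator pattern was detected
    category_code: Optional[str] = None  # code read from the designated field


@dataclass
class Document:
    category: Optional[str]
    pages: List[int] = field(default_factory=list)


def split_batch(pages: List[Page], categories: Dict[str, str]) -> List[Document]:
    """Split a scanned batch into documents at each separator page.

    Assumption: a page bearing a separator starts a new document. Pages
    before the first separator form an uncategorized document.
    """
    documents: List[Document] = []
    for page in pages:
        if page.has_separator or not documents:
            # Map the recognized code onto the set of known categories;
            # an unknown or missing code leaves the category as None.
            category = categories.get(page.category_code) if page.has_separator else None
            documents.append(Document(category=category))
        documents[-1].pages.append(page.number)
    return documents
```

For example, a five-page batch with separators on pages 1 and 3 would yield two documents, covering pages 1 to 2 and pages 3 to 5, each assigned the category that its code maps to.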