Abstract:
A document authentication method using block-by-block image comparison is disclosed. An image of an original document and an image of a target document are each segmented into multiple blocks corresponding to paragraphs of text. A first block in the original image is used to search the target image to find a corresponding first block using a cross-correlation method. The position mapping for the first block of the target image is calculated and alterations are detected. Then, for each subsequent block of the original image, a corresponding block of the target document is identified based on the position of the subsequent block of the original image relative to the first block of the original image and the position mapping for the first block of the target image. The corresponding subsequent blocks of the original and target images are compared to detect alterations using a method other than cross-correlation.
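The search-then-map flow can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes the position mapping is a pure translation, represents images as nested lists of pixel values, and uses hypothetical helper names (ncc, crop, find_block, map_block).

```python
def ncc(a, b):
    """Normalized cross-correlation of two equally sized 2D patches."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = (sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb)) ** 0.5
    return num / den if den else 0.0


def crop(img, top, left, h, w):
    """Cut an h-by-w window out of img at (top, left)."""
    return [row[left:left + w] for row in img[top:top + h]]


def find_block(target, patch):
    """Exhaustively search target for the window that best matches patch."""
    h, w = len(patch), len(patch[0])
    best = (-2.0, 0, 0)  # (score, top, left); NCC is never below -1
    for top in range(len(target) - h + 1):
        for left in range(len(target[0]) - w + 1):
            score = ncc(crop(target, top, left, h, w), patch)
            if score > best[0]:
                best = (score, top, left)
    return best[1], best[2]


def map_block(orig_first, target_first, orig_block):
    """Predict where a later block lies in the target (translation-only assumption)."""
    dy = target_first[0] - orig_first[0]
    dx = target_first[1] - orig_first[1]
    return orig_block[0] + dy, orig_block[1] + dx
```

Only the first block pays for the exhaustive search; every later block is located by arithmetic on the stored offset and can then be compared with a cheaper method.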
Abstract:
A method and program for encoding and decoding color barcodes to increase their data capacity. The encoding steps include determining a shape and a color for each data cell to encode digital data, wherein a combination of the shape and the color for the data cell is chosen from a plurality of combinations of shapes and colors in accordance with a value of the digital data to be encoded, and coloring a subset of the plurality of pixels in each data cell in accordance with the shape and the color for the data cell determined above. The decoding steps include segmenting the data cells in a color barcode, recognizing a shape formed by a subset of pixels in each data cell and the color of the shape, and obtaining digital data from a combination of the recognized shape and color in each data cell.
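As a rough sketch of how shape and color combinations raise the capacity of each cell, the example below maps 4-bit values to one of 16 combinations. The shape names, color names, and function names are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import product

SHAPES = ["square", "circle", "triangle", "cross"]  # hypothetical shape set
COLORS = ["cyan", "magenta", "yellow", "black"]     # hypothetical color set
COMBOS = list(product(SHAPES, COLORS))              # 16 combinations
BITS_PER_CELL = len(COMBOS).bit_length() - 1        # 4 bits per data cell


def encode(data: bytes):
    """Turn bytes into a list of (shape, color) pairs, one per data cell."""
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % BITS_PER_CELL)  # pad to a whole number of cells
    return [COMBOS[int(bits[i:i + BITS_PER_CELL], 2)]
            for i in range(0, len(bits), BITS_PER_CELL)]


def decode(cells, n_bytes):
    """Recover n_bytes of data from recognized (shape, color) pairs."""
    bits = "".join(f"{COMBOS.index(c):0{BITS_PER_CELL}b}" for c in cells)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))
```

With four shapes and four colors, each cell carries 4 bits, compared with 2 bits for a cell that varies in color alone.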
Abstract:
A method for binarizing scanned document images containing gray or light-colored text printed with halftone patterns. The document image is initially binarized and connected image components are extracted from the initial binary image as text characters. Each text character is classified as either a halftone text character or a non-halftone text character based on an analysis of its topology features. The topology features may be the Euler number of the text character; a text character with an Euler number below −2 is classified as halftone text. The gray-scale document image is then divided into halftone text regions containing only halftone text characters and non-halftone text regions. Each region is binarized using its own pixel value statistics. This eliminates the influence of black text on the threshold values for binarizing halftone text. The binary maps of the regions are combined to generate the final binary map.
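The Euler number of a binary shape is its number of connected components minus its number of holes, so a character riddled with halftone gaps scores strongly negative. The sketch below is one way to compute it under stated assumptions: characters are nested lists with 1 for ink, foreground uses 8-connectivity and background 4-connectivity, and all function names are hypothetical.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(grid, value, neighbors):
    """Count connected regions of `value` using the given neighborhood."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] != value or seen[y][x]:
                continue
            count += 1
            seen[y][x] = True
            stack = [(y, x)]
            while stack:  # iterative flood fill
                cy, cx = stack.pop()
                for dy, dx in neighbors:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def euler_number(char):
    """Components minus holes; padding makes the outside one background region."""
    w = len(char[0])
    padded = [[0] * (w + 2)] + [[0] + list(r) + [0] for r in char] + [[0] * (w + 2)]
    objects = count_components(padded, 1, N8)
    holes = count_components(padded, 0, N4) - 1  # subtract the outer background
    return objects - holes


def is_halftone(char, threshold=-2):
    """Classify a character as halftone when its Euler number is below the threshold."""
    return euler_number(char) < threshold
```

A solid stroke has Euler number 1 and a letter such as "o" has 0, so ordinary characters stay well above the −2 cutoff.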
Abstract:
A method of generating a self-authenticating printed document and authenticating the printed document. The back side of the printed document contains 2D barcodes which encode extracted features of the document content. The features are hashed into a hash code, converted to a barcode stamp element, and transformed into a hierarchical barcode stamp by repeating the stamp element. The hierarchical barcode stamp is printed as a gray background pattern on the front side of the same sheet of the printed document. To authenticate the printed document, the barcodes on the back side are read to extract the document features. The features are hashed into a hash code and compared to the hash code extracted from the hierarchical barcode stamp on the front side of the document to detect any alterations of the back side barcodes. Further, the document features extracted from the front and back sides of the document are compared.
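The hash-and-repeat step might look like the sketch below. It is an illustration only: SHA-256 truncated to 64 bits, an 8-by-8 stamp element, and plain tiling in place of the hierarchical arrangement are all assumptions, as are the function names.

```python
import hashlib


def feature_hash(features, n_bits=64):
    """Hash a list of feature strings into n_bits bits (most significant first)."""
    digest = hashlib.sha256("\n".join(features).encode("utf-8")).digest()
    value = int.from_bytes(digest, "big") >> (256 - n_bits)
    return [(value >> i) & 1 for i in reversed(range(n_bits))]


def stamp_element(bits, side=8):
    """Lay the hash bits out as a side-by-side square of binary modules."""
    return [bits[r * side:(r + 1) * side] for r in range(side)]


def tile(element, reps_y, reps_x):
    """Repeat the stamp element to cover the page background."""
    return [row * reps_x for _ in range(reps_y) for row in element]


def verify(back_features, stamp, side=8):
    """Compare the hash of the back-side features with the front-side stamp."""
    recovered = [b for row in stamp[:side] for b in row[:side]]
    return recovered == feature_hash(back_features, side * side)
```

Because the element is repeated across the page, a reader could in principle recover the hash from any undamaged copy; this sketch reads only the top-left one.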
Abstract:
A method implemented in a fax machine for analyzing a received fax to determine whether it is an auto-reply fax. Auto-reply faxes are handled differently from other faxes to avoid unnecessary printing. The analysis method includes: determining whether the sender of the received fax is the same as the receiver of a fax sent by the fax machine within a predefined time period in the past; determining whether the received fax contains only one or two pages; extracting text from the image of the received fax using OCR; and detecting the presence of certain keywords in the extracted text which indicate an auto-reply or received status. These determination and detection results are combined to determine whether the received fax is an auto-reply. Auto-reply faxes may be saved but not automatically printed, or forwarded to an email box of the sender of the original fax, etc.
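The checks listed above can be combined in a few lines. The sketch assumes the results are combined with a logical AND, that OCR has already produced the text, and that the keyword list, time window, and all names are hypothetical; the abstract does not fix any of these.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical keyword list; a real deployment would tune it.
KEYWORDS = ("auto-reply", "automatic reply", "received", "out of office")


@dataclass
class ReceivedFax:
    sender: str
    received_at: datetime
    page_count: int
    ocr_text: str  # text already extracted by an OCR step (not shown)


def is_auto_reply(fax, sent_log, window=timedelta(minutes=30)):
    """sent_log is a list of (recipient_number, sent_at) for outgoing faxes."""
    replied_to_recent = any(
        rcpt == fax.sender and timedelta(0) <= fax.received_at - sent_at <= window
        for rcpt, sent_at in sent_log
    )
    is_short = fax.page_count <= 2
    text = fax.ocr_text.lower()
    has_keyword = any(k in text for k in KEYWORDS)
    # Assumption: all three signals must agree.
    return replied_to_recent and is_short and has_keyword
```

A weighted score over the same three signals would be an equally plausible way to combine them.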
Abstract:
A method of generating a self-authenticating document while utilizing a document digest stored on a server for verification purposes. Authentication information for the document is encoded in a barcode which is printed on the document. A document digest is calculated from the authentication information and transmitted to a server to be stored. When authenticating a scanned copy of the document, the barcode is read to extract the authentication information. A target document digest is calculated from the extracted authentication information and transmitted to the server for verification. The server compares the target document digest with the previously stored document digest. If they are not the same, the barcode has been altered. If they are the same, the extracted authentication information is used to authenticate the scanned copy. A document ID may be generated and transmitted to the server, and used by the server to index or search for the stored document digest.
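The register-then-verify exchange can be sketched with an in-memory stand-in for the server. SHA-256 as the digest function, the DigestServer class, and the document ID format are assumptions for illustration.

```python
import hashlib


class DigestServer:
    """In-memory stand-in for the verification server (hypothetical)."""

    def __init__(self):
        self._store = {}

    def register(self, doc_id, digest):
        """Store the digest computed when the document was generated."""
        self._store[doc_id] = digest

    def verify(self, doc_id, digest):
        """True only if a digest is stored under doc_id and it matches."""
        return self._store.get(doc_id) == digest


def digest_of(auth_info: bytes) -> str:
    """Digest of the authentication information carried in the barcode."""
    return hashlib.sha256(auth_info).hexdigest()
```

Only the digest leaves the client, so the server can confirm that the barcode is unaltered without holding the authentication information itself.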
Abstract:
A method is described to obtain a binary image from the print-and-scan (PAS) process that best matches the known original. A point-spread function (PSF) of the PAS process is first obtained from its knife-edge responses, and deblurring is carried out on the scanned images using deconvolution. After image deskewing and preliminary registration, a supervised adaptive thresholding procedure is utilized to binarize the scanned image such that a measure of difference (e.g. the Euclidean distance) between the original and binarized images is minimized. The supervised adaptive thresholding procedure divides the scanned images into many rectangular sub-images. Otsu's method is used to find a starting threshold for each scanned sub-image. An optimal threshold is found around the Otsu threshold via iterative search to minimize the measure of difference between the original sub-image and scanned sub-image. The sub-images are binarized using the optimal threshold. This method may be used in document authentication.
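For a single sub-image, the supervised thresholding step could be sketched as below. The sketch assumes 8-bit pixels given as a flat list, treats dark pixels as ink, and replaces the iterative search with an exhaustive scan of a window around the Otsu threshold; the radius parameter and function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(pixels, t):
    """1 = ink (pixel at or below the threshold), 0 = background."""
    return [1 if p <= t else 0 for p in pixels]


def distance(a, b):
    """Euclidean distance between two binary images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def supervised_threshold(scanned, original_binary, radius=20):
    """Search around the Otsu threshold for the best match to the original."""
    start = otsu_threshold(scanned)
    candidates = range(max(0, start - radius), min(255, start + radius) + 1)
    return min(candidates,
               key=lambda t: distance(binarize(scanned, t), original_binary))
```

Running the same routine on every rectangular sub-image yields one threshold per sub-image, which is what makes the procedure adaptive.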
Abstract:
A method for encoding and decoding color barcodes to increase their data capacity. The encoding steps include determining a shape, a foreground color and a background color for each data cell, wherein a combination of the shape, foreground and background colors for the data cell is chosen from a plurality of such combinations in accordance with a value of the digital data to be encoded; and coloring some pixels in the data cell with a foreground color and other pixels with a background color, in accordance with the shape, foreground and background colors for the data cell determined above. The decoding steps include segmenting the data cells, recognizing a shape, a foreground color of the shape and a background color of the data cell, and obtaining digital data from a combination of the shape and foreground and background colors in each data cell.
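At the pixel level, coloring a cell by shape, foreground, and background might look like the following sketch. The 3-by-3 shape masks, the three-color palette, and the template-matching decoder are assumptions chosen to keep the example small.

```python
SHAPE_MASKS = {  # hypothetical 3x3 shapes; 1 marks a foreground pixel
    "plus": ((0, 1, 0), (1, 1, 1), (0, 1, 0)),
    "diag": ((1, 0, 0), (0, 1, 0), (0, 0, 1)),
}
COLORS = ("red", "green", "blue")  # hypothetical palette

# Every (shape, foreground, background) triple with distinct colors: 2 * 3 * 2 = 12.
COMBOS = [(s, f, b) for s in SHAPE_MASKS for f in COLORS for b in COLORS if f != b]


def render_cell(value):
    """Color the pixels of one data cell for the given value."""
    shape, fg, bg = COMBOS[value]
    return [[fg if bit else bg for bit in row] for row in SHAPE_MASKS[shape]]


def read_cell(cell):
    """Recognize shape and colors by matching against every known rendering."""
    for value in range(len(COMBOS)):
        if render_cell(value) == cell:
            return value
    raise ValueError("unrecognized cell")
```

Letting the background color vary as well multiplies the number of distinguishable cells, here from 6 shape and foreground pairs to 12 triples.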
Abstract:
A document authentication method determines the authenticity of a target hardcopy document, which purports to be a true copy of an original hardcopy document. The method compares a binarized image of the target document with a binarized image of the original document which has been stored in a storage device. The image of the original document is generated by binarizing a scanned grayscale image of the original document. Halftone and non-halftone text areas in the grayscale image are separated, and the two types of text are separately binarized. The non-halftone text areas are then down-sampled. During authentication, a scanned grayscale image of the target document is binarized by separating halftone and non-halftone text areas and binarizing them separately, and then down-sampling the non-halftone text areas. The binarized images of the target document and the original document are compared to determine the authenticity of the target document.
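A minimal sketch of the per-region binarize, down-sample, and compare steps follows. The abstract does not say how thresholds are chosen or how down-sampling is done, so thresholding at the region mean and majority voting over 2-by-2 blocks are assumptions, as are the function names.

```python
def binarize_region(region):
    """Threshold a grayscale region at its own mean (1 = ink)."""
    flat = [p for row in region for p in row]
    t = sum(flat) / len(flat)
    return [[1 if p < t else 0 for p in row] for row in region]


def downsample(binary, factor=2):
    """Shrink a binary region; each output pixel is a majority vote (ties count as ink)."""
    h, w = len(binary) // factor, len(binary[0]) // factor
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            block = [binary[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(1 if 2 * sum(block) >= len(block) else 0)
        out.append(row)
    return out


def differing_pixels(a, b):
    """Count positions where two equally sized binary images disagree."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Down-sampling shrinks the stored original and the comparison workload; a nonzero difference count above some tolerance would flag a possible alteration.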
Abstract:
A document authentication method compares a target document image (scanned image) with an original document image at multiple levels, such as block (e.g. paragraph, graphics, image), line, word and character levels. The paragraph level comparison determines whether the target and original images have the same number of paragraphs and whether the paragraphs have the same sizes and locations; the line level comparison determines if the target and original images have the same number of lines and whether the lines have the same sizes and locations; etc. Document segmentation is performed on the target and original images to segment them into paragraph units, line units, etc. for purposes of the comparisons. The original document may be segmented beforehand and the segmentation information stored for later use. The authentication process may be designed to stop when alterations are detected at a higher level, so lower level comparisons are not carried out.
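The top-down comparison with early stopping can be sketched as follows. It assumes segmentation has already produced, for each level, a list of bounding boxes in reading order; the box format, the pixel tolerance, and the function names are hypothetical.

```python
LEVELS = ("paragraph", "line", "word", "character")


def same_layout(orig_boxes, target_boxes, tol=2):
    """Same number of units, with matching positions and sizes within tol pixels."""
    if len(orig_boxes) != len(target_boxes):
        return False
    return all(abs(a - b) <= tol
               for o, t in zip(orig_boxes, target_boxes)
               for a, b in zip(o, t))


def authenticate(original, target, tol=2):
    """Compare level by level; return the first altered level, or None.

    original and target map each level name to a list of
    (x, y, width, height) boxes in reading order.
    """
    for level in LEVELS:
        if not same_layout(original[level], target[level], tol):
            return level  # stop here; lower levels are not compared
    return None
```

Because the loop returns at the first mismatch, an inserted paragraph is reported without spending time on line, word, or character comparisons.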