Abstract:
Techniques for improved binarization and extraction of information from digital image data are disclosed in accordance with various embodiments. The inventive concepts include independently binarizing portions of the image data on the basis of individual features, e.g. per connected component, and using multiple different binarization thresholds to obtain the best possible binarization result for each portion of the image data independently binarized. Determining the quality of each binarization result may be based on attempted recognition and/or extraction of information therefrom. Independently binarized portions may be assembled into a contiguous result. In one embodiment, a method includes: identifying a region of interest within a digital image; generating a plurality of binarized images based on the region of interest using different binarization thresholds; and extracting data from some or all of the plurality of binarized images. Corresponding systems and computer program products are also disclosed.
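The multi-threshold step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes a grayscale image stored as nested lists and simple global thresholding. The names `crop`, `binarize`, `binarize_with_thresholds`, and `best_result` are hypothetical, and the `score` callback stands in for whatever recognition or extraction based quality measure an embodiment might use.

```python
from typing import Callable, Dict, List, Sequence

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Return the region of interest as a new grayscale image."""
    return [row[left:left + width] for row in image[top:top + height]]


def binarize(region: Image, threshold: int) -> Image:
    """Map pixels darker than the threshold to 1 (ink) and the rest to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in region]


def binarize_with_thresholds(region: Image, thresholds: Sequence[int]) -> Dict[int, Image]:
    """Generate one binarized image per threshold."""
    return {t: binarize(region, t) for t in thresholds}


def best_result(results: Dict[int, Image], score: Callable[[Image], float]) -> int:
    """Return the threshold whose binarized image scores highest.

    `score` is an assumed quality measure, e.g. recognition confidence.
    """
    return max(results, key=lambda t: score(results[t]))
```

In practice the score would likely come from attempting recognition on each candidate, so that each region keeps whichever threshold yields the most usable output.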
Abstract:
A method includes: displaying a digital image on a first portion of a display of a mobile device; receiving user feedback via the display of the mobile device; analyzing the user feedback to determine a meaning of the user feedback; based on the determined meaning of the user feedback, analyzing a portion of the digital image corresponding to either a point of interest or a region of interest to detect one or more connected components depicted within the portion of the digital image; classifying each detected connected component depicted within the portion of the digital image; estimating an identity of each detected connected component based on the classification of the detected connected component; and one or more of: displaying the identity of each detected connected component on a second portion of the display of the mobile device; and providing the identity of each detected connected component to a workflow.
Abstract:
Computer program products for discriminating hand and machine print from each other, and from signatures, are disclosed and include program code readable and/or executable by a processor to: receive an image; determine a color depth of the image; reduce the color depth of non-bi-tonal images to generate a bi-tonal representation of the image; identify a set of one or more graphical line candidates in either the bi-tonal image or the bi-tonal representation, the graphical line candidates including true graphical lines and/or false positives; discriminate any of the true graphical lines from any of the false positives; remove the true graphical lines from the bi-tonal image or the bi-tonal representation without removing the false positives to generate a component map comprising connected components and excluding graphical lines; identify one or more of the connected components in the component map; and output and/or display an indicator of each of the connected components.
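Two of the steps above, color depth reduction and connected component identification, can be sketched as follows. This is an illustration under stated assumptions: the input is 8-bit grayscale stored as nested lists, the fixed threshold of 128 is arbitrary, components are 4-connected, and a bounding box serves as the "indicator". The names `to_bitonal` and `connected_components` are hypothetical, and the graphical line detection and removal steps are not shown.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def to_bitonal(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Reduce 8-bit grayscale to bi-tonal: 1 for dark pixels, 0 otherwise."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(bitonal: List[List[int]]) -> List[Box]:
    """Return the bounding box of every 4-connected group of foreground pixels."""
    height = len(bitonal)
    width = len(bitonal[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for r in range(height):
        for c in range(width):
            if bitonal[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and bitonal[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Boxes are returned in row-major order of each component's first pixel, which keeps the output deterministic for downstream classification.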
Abstract:
Systems, computer program products, and techniques for detecting objects depicted in digital image data are disclosed, according to various exemplary embodiments. The inventive concepts uniquely utilize internal features to accomplish object detection, thereby avoiding reliance on detecting object edges and/or transitions between the object and other portions of the digital image data, e.g. background textures or other objects. The inventive concepts thus provide an improvement over conventional object detection since objects may be detected even when edges are obscured or not depicted in the digital image data. In one aspect, a computer-implemented method of detecting an object depicted in a digital image includes: detecting a plurality of identifying features of the object, wherein the plurality of identifying features are located internally with respect to the object; and projecting a location of one or more edges of the object based at least in part on the plurality of identifying features.
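One way to picture projecting edges from internal features is sketched below. It assumes a known template of the object (feature and corner coordinates) and a similarity transform (rotation, uniform scale, translation) estimated from exactly two matched features. The source does not state which transform model is used, and perspective distortion is ignored here. The function name `project_edges` and all coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _as_complex(p: Point) -> complex:
    """Treat an (x, y) point as the complex number x + yi."""
    return complex(p[0], p[1])


def project_edges(
    template_features: Sequence[Point],
    detected_features: Sequence[Point],
    template_corners: Sequence[Point],
) -> List[Point]:
    """Project template corners into the image from two matched internal features.

    A similarity transform maps z to a * z + b in the complex plane, where
    `a` encodes rotation and scale and `b` encodes translation.
    """
    z0, z1 = (_as_complex(p) for p in template_features)
    w0, w1 = (_as_complex(p) for p in detected_features)
    if z0 == z1:
        raise ValueError("template features must be distinct")
    a = (w1 - w0) / (z1 - z0)
    b = w0 - a * z0
    projected = [a * _as_complex(p) + b for p in template_corners]
    return [(q.real, q.imag) for q in projected]
```

Consecutive projected corners define the estimated edge locations, even when the edges themselves are obscured or fall outside the frame.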
Abstract:
According to one embodiment, a computer-implemented method for confirming/rejecting a most relevant example includes: generating a binary decision model by training a binary classifier using a plurality of training documents; classifying one or more test documents into one of a plurality of categories using the binary decision model, wherein the one or more test documents lack a user-defined category label; selecting a most relevant example of the classified test documents from among the classified test documents; displaying, using a display of the computer, the most relevant example of the classified test documents to a user; receiving, via the computer and from the user, a confirmation or a negation of a classification label of the most relevant example of the classified test documents; and storing the confirmation or the negation of the classification label of the most relevant example of the classified test documents to a memory of the computer.
Abstract:
Techniques for capturing long document images and generating composite images therefrom include: detecting a document depicted in image data; tracking a position of the detected document within the image data; selecting a plurality of images, wherein the selection is based at least in part on the tracked position of the detected document; and generating a composite image based on at least one of the selected plurality of images. The tracking and selection are optionally but preferably based in whole or in part on motion vectors estimated at least partially based on analyzing image data such as test and reference frames within the captured video data/images. Corresponding systems and computer program products are also disclosed.
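The position-based selection described above can be sketched as follows. This assumes one motion vector per frame, given as `(dx, dy)` in pixels, and a selection rule that keeps a frame once the document has moved at least `min_shift` pixels since the last kept frame. That rule, and the names `track_positions` and `select_frames`, are assumptions for illustration; composite generation is not shown.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]


def track_positions(motion_vectors: Sequence[Vector]) -> List[Vector]:
    """Accumulate per-frame (dx, dy) motion into a document position per frame."""
    x = y = 0.0
    positions: List[Vector] = []
    for dx, dy in motion_vectors:
        x += dx
        y += dy
        positions.append((x, y))
    return positions


def select_frames(positions: Sequence[Vector], min_shift: float) -> List[int]:
    """Keep the first frame, then each frame at least `min_shift` from the last kept one."""
    selected: List[int] = []
    last = None
    for index, (x, y) in enumerate(positions):
        if last is None or math.hypot(x - last[0], y - last[1]) >= min_shift:
            selected.append(index)
            last = (x, y)
    return selected
```

Choosing `min_shift` smaller than the frame height leaves overlap between consecutive selected frames, which a stitching step would typically need.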
Abstract:
In one embodiment, a method includes receiving, at a mobile device, an image depicting a document; attempting to classify, using a processor of the mobile device, the document depicted in the image to one of a plurality of predetermined document classes, wherein attempting to classify the document results in an ambiguous classification result; determining, using the mobile device, location information identifying a geographic location of the mobile device at a particular time; and disambiguating, using the processor of the mobile device, the ambiguous classification result based on the location information. Exemplary systems and computer program products are also described.
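The disambiguation step can be sketched as a lookup keyed on the device's region. Everything concrete here is hypothetical: the class names, the `CLASS_REGION` table, and the assumption that the device location has already been resolved to a region code by some earlier step.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from document class to the region that issues it.
CLASS_REGION: Dict[str, str] = {
    "drivers_license_CA": "CA",
    "drivers_license_NV": "NV",
}


def disambiguate(
    candidates: List[str],
    device_region: str,
    class_region: Dict[str, str] = CLASS_REGION,
) -> Optional[str]:
    """Return the single candidate class matching the device region.

    Returns None when the location does not resolve the ambiguity,
    i.e. when zero or several candidates match.
    """
    matches = [c for c in candidates if class_region.get(c) == device_region]
    return matches[0] if len(matches) == 1 else None
```

Returning `None` rather than guessing leaves room for a fallback, such as asking the user or using another signal.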
Abstract:
According to one embodiment, a computer-implemented method includes: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting an identifier of the document from the image based at least in part on the OCR; comparing the identifier with content from one or more reference data sources, wherein the content from the one or more reference data sources comprises global address information; and determining whether the identifier is valid based at least in part on the comparison. The method may optionally include normalizing the extracted identifier, retrieving additional geographic information, correcting OCR errors, etc. based on comparing extracted information with reference content. Corresponding systems and computer program products are also disclosed.
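The normalize, compare, and correct flow can be sketched as below. This assumes a digits-only identifier (such as a postal code) and an in-memory set of reference values. The substitution table is an illustrative, incomplete list of common OCR confusions, and the names `normalize` and `validate` are hypothetical.

```python
from typing import Iterable, Optional

# Illustrative OCR confusions for digits-only identifiers (assumed, not exhaustive).
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "I": "1", "L": "1", "S": "5", "B": "8"})


def normalize(identifier: str) -> str:
    """Uppercase the identifier and drop whitespace and hyphens."""
    return "".join(ch for ch in identifier.upper() if not ch.isspace() and ch != "-")


def validate(identifier: str, reference: Iterable[str]) -> Optional[str]:
    """Return the matching reference value, or None if the identifier is invalid.

    The normalized identifier is tried first. If it is not found, common
    letter-for-digit OCR confusions are corrected and the lookup is retried.
    """
    known = set(reference)
    candidate = normalize(identifier)
    if candidate in known:
        return candidate
    corrected = candidate.translate(OCR_DIGIT_FIXES)
    return corrected if corrected in known else None
```

Accepting a correction only when the result appears in the reference data keeps the substitution from inventing identifiers that do not exist.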
Abstract:
Systems, methods, and computer program products are disclosed and include: initiating a capture operation using an image capture component of a mobile device, the capture operation comprising: capturing video data; and estimating a plurality of motion vectors corresponding to motion of the image capture component during the capture operation. The systems, methods, and computer program products also include detecting a document depicted in the video data; tracking a position of the detected document throughout the video data; selecting a plurality of images using the image capture component of the mobile device, wherein the selection is based at least in part on: the tracked position of the detected document; and the estimated motion vectors; and generating a composite image based on at least some of the selected plurality of images.
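Motion vector estimation between two frames can be illustrated with exhaustive block matching, one common approach. The source does not say which estimator is used, so the technique, the mean absolute difference cost, and the name `estimate_motion` are assumptions. Frames are small grayscale grids stored as nested lists.

```python
from typing import List, Tuple

Frame = List[List[int]]


def estimate_motion(reference: Frame, test: Frame, max_shift: int = 2) -> Tuple[int, int]:
    """Return the (dy, dx) shift that best maps `reference` onto `test`.

    Every shift within +/- max_shift is tried, and the one with the lowest
    mean absolute difference over the overlapping pixels wins.
    """
    height, width = len(reference), len(reference[0])
    best_cost = None
    best_shift = (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total = 0
            count = 0
            for y in range(height):
                for x in range(width):
                    ty, tx = y + dy, x + dx
                    if 0 <= ty < height and 0 <= tx < width:
                        total += abs(reference[y][x] - test[ty][tx])
                        count += 1
            if count == 0:
                continue  # no overlap at this shift
            cost = total / count
            if best_cost is None or cost < best_cost:
                best_cost = cost
                best_shift = (dy, dx)
    return best_shift
```

The exhaustive search is quadratic in `max_shift`, so a real-time capture pipeline would likely restrict it to a few blocks or a coarse-to-fine scheme.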
Abstract:
Systems, methods, and computer program products for smart, automated capture of textual information using optical sensors of a mobile device are disclosed. The textual information is provided to a mobile application or workflow without requiring the user to manually enter or transfer the data, and without requiring user intervention such as a copy/paste operation. The capture and provision are context-aware, and can normalize or validate the captured textual information prior to entry in the workflow or mobile application. Other information needed by the workflow and available to the mobile device optical sensors may also be captured and provided in a single automatic process. As a result, the overall process of capturing information from optical input using a mobile device is significantly simplified and improved in terms of accuracy of data transfer/entry, speed and efficiency of workflows, and user experience.