Abstract:
According to one embodiment, a system includes a processor and logic in and/or executable by the processor to cause the processor to: initiate a capture operation using an image capture component of a mobile device, the capture operation comprising: capturing video data; and estimating a plurality of motion vectors corresponding to motion of the image capture component during the capture operation; detect a document depicted in the video data; track a position of the detected document throughout the video data; select a plurality of images using the image capture component of the mobile device, wherein the selection is based at least in part on: the tracked position of the detected document; and the estimated motion vectors; and generate a composite image based on at least some of the selected plurality of images.
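As an illustration only, the selection step might be sketched as follows. The abstract does not say how the tracked position and the motion vectors are combined, so the `Frame` fields, the thresholds `max_motion` and `min_shift`, and the selection rule itself are assumptions made for this example.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple


@dataclass
class Frame:
    index: int
    doc_center: Tuple[float, float]  # tracked document position, in pixels (hypothetical field)
    motion: Tuple[float, float]      # estimated camera motion vector, pixels per frame (hypothetical field)


def select_frames(frames: List[Frame], max_motion: float = 2.0,
                  min_shift: float = 50.0) -> List[Frame]:
    """Keep low-motion frames whose tracked document position has moved
    at least `min_shift` pixels since the previously kept frame."""
    selected: List[Frame] = []
    for frame in frames:
        # Assumed rule: a large motion vector suggests motion blur, so skip the frame.
        if hypot(*frame.motion) > max_motion:
            continue
        if selected:
            last = selected[-1]
            shift = hypot(frame.doc_center[0] - last.doc_center[0],
                          frame.doc_center[1] - last.doc_center[1])
            # Assumed rule: a frame showing nearly the same region adds little to a composite.
            if shift < min_shift:
                continue
        selected.append(frame)
    return selected
```

The frames returned by such a filter would then be candidates for the composite image; stitching them together is outside the scope of this sketch.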
Abstract:
A computer program product includes program instructions configured to cause a processor to: perform optical character recognition (OCR) on an image of a document; extract an identifier of the document from the image based at least in part on the OCR; compare at least portions of the identifier with content from one or more reference data sources; and determine whether the identifier is valid based at least in part on the comparison. The content comprises global address information, and the content from the one or more reference data sources is derived from geographic information. Deriving the content from the geographic information includes: obtaining the geographic information; and parsing the geographic information according to a set of predefined heuristic rules, where the heuristic rules are configured to normalize the global address information obtained from the one or more sources according to a single convention for representing address information.
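A minimal sketch of the normalize-then-compare idea, assuming the identifier is a postal address already extracted by OCR. The abbreviation table and the functions `normalize_address` and `is_valid` are hypothetical stand-ins for the predefined heuristic rules, which the abstract does not enumerate.

```python
import re

# Hypothetical heuristic rules: map common abbreviations to one convention.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road", "blvd": "boulevard"}


def normalize_address(raw: str) -> str:
    """Lower-case, collapse commas and whitespace, and expand abbreviations."""
    text = re.sub(r"[,\s]+", " ", raw.strip().lower())
    tokens = [token.rstrip(".") for token in text.split(" ")]
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def is_valid(identifier: str, reference: set) -> bool:
    """Compare the normalized identifier with normalized reference content."""
    return normalize_address(identifier) in reference


# Reference content is normalized with the same rules before comparison.
REFERENCE = {normalize_address(a) for a in
             ["123 Main Street, Springfield", "9 Elm Ave, Shelbyville"]}
```

Applying the same rules to both sides means that spellings such as "123 Main St., Springfield" and "123 Main Street, Springfield" compare as equal. A real rule set would also need to handle ambiguous tokens, for example "St" meaning "Saint".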
Abstract:
According to one embodiment, a computer-implemented method for cleaning up a data set having a possibly incorrect label includes: selecting a plurality of training documents; estimating a quality of an organization of a plurality of categories; and determining whether the quality of the organization is greater than a predetermined quality threshold. Corresponding system and computer program product embodiments are also presented. Other aspects and advantages of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrates by way of example the principles of the invention.
Abstract:
In one embodiment, a system includes a processor and logic executable by the processor. The logic is configured to cause the processor to: capture video data using a mobile device, the video data comprising a plurality of frames; determine whether one or more of the frames depict a document exhibiting one or more defining characteristics; determine whether one or more of the frame(s) determined to depict the document also satisfy one or more predetermined quality control criteria; and in response to determining one or more of the frames depict the document and also satisfy the one or more predetermined quality control criteria, automatically capture an image of the document. Corresponding computer program products are also disclosed.
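The capture condition can be read as a conjunction of a document test and a set of quality checks. The sketch below assumes each frame arrives as a dictionary of precomputed measurements; the keys `has_document`, `sharpness`, and `brightness`, and the thresholds, are hypothetical and not taken from the abstract.

```python
from typing import Callable, Dict, Iterable, List, Optional

FrameInfo = Dict[str, float]


def depicts_document(frame: FrameInfo) -> bool:
    # Hypothetical flag set by an upstream document detector.
    return bool(frame["has_document"])


def is_sharp(frame: FrameInfo, minimum: float = 0.5) -> bool:
    return frame["sharpness"] >= minimum


def is_well_lit(frame: FrameInfo, low: float = 0.3, high: float = 0.8) -> bool:
    return low <= frame["brightness"] <= high


def auto_capture(frames: Iterable[FrameInfo],
                 criteria: List[Callable[[FrameInfo], bool]]) -> Optional[int]:
    """Return the index of the first frame that depicts a document and
    passes every quality control check, or None if no frame qualifies."""
    for index, frame in enumerate(frames):
        if depicts_document(frame) and all(check(frame) for check in criteria):
            return index
    return None
```

Passing the checks as a list keeps the quality control criteria configurable without changing the capture loop.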
Abstract:
In one approach, a method includes: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting data of interest from the image based at least in part on the OCR; and validating the extracted data of interest against reference information stored on the mobile device. In another embodiment, a method includes: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting data of interest from the image based at least in part on the OCR; and validating authenticity of the document based on comparing some or all of the extracted data of interest to reference information stored on the mobile device.
Abstract:
Systems, methods, and computer program products for smart, automated capture of textual information using optical sensors of a mobile device are disclosed. The capture and provision are context-aware: the techniques determine the context of the optical input and invoke a contextually appropriate workflow based thereon. The techniques also provide capability to normalize, correct, and/or validate the captured optical input and provide the corrected, normalized, validated, etc. information to the contextually appropriate workflow. Other information required by the workflow and available to the mobile device optical sensors may also be captured and provided, in a single automatic process. As a result, the overall process of capturing information from optical input using a mobile device, invoking an appropriate workflow, and providing captured information to the workflow is significantly simplified and improved in terms of accuracy of data transfer/entry, speed and efficiency of workflows, and user experience.
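One way to picture context determination followed by workflow invocation is a small pattern-based dispatcher. This is a sketch under stated assumptions: the context names, the regular expressions, and the `WORKFLOWS` table are invented for illustration, and the abstract does not say how context is actually determined.

```python
import re

# Hypothetical context detectors, tried in order.
PATTERNS = [
    ("email", re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
    ("url", re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)),
    ("phone", re.compile(r"^\+?[\d\s().-]{7,}$")),
]

# Hypothetical workflows; each receives normalized text and returns an action string.
WORKFLOWS = {
    "email": lambda text: "compose:" + text,
    "url": lambda text: "open:" + text,
    "phone": lambda text: "dial:" + re.sub(r"[^\d+]", "", text),  # keep digits and '+'
    "plain_text": lambda text: "note:" + text,
}


def determine_context(text: str) -> str:
    """Return the name of the first pattern matching the captured text."""
    cleaned = text.strip()
    for name, pattern in PATTERNS:
        if pattern.match(cleaned):
            return name
    return "plain_text"


def capture_and_dispatch(text: str):
    """Determine the context, then invoke the matching workflow."""
    context = determine_context(text)
    return context, WORKFLOWS[context](text.strip())
```

The phone workflow shows a simple normalization step: punctuation and spaces are stripped before the value is handed on.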
Abstract:
A method includes: capturing or receiving at least one image of one or more identity documents (IDs) using a mobile device; determining identifying information from one or more of the IDs; building an ID profile based on the identifying information; storing the ID profile to a memory of the mobile device; invoking a workflow configured to facilitate a business transaction; detecting a predetermined stimulus in the workflow, the stimulus relating to the business transaction; providing at least a portion of the ID profile to the workflow in response to detecting the predetermined stimulus; and driving at least a portion of the workflow using the provided portion of the ID profile. Related systems and computer program products are also disclosed.
Abstract:
A method includes receiving or capturing a digital image using a mobile device, and using a processor of the mobile device to: determine whether an object depicted in the digital image belongs to a particular object class among a plurality of object classes; determine one or more object features of the object based at least in part on the particular object class at least partially in response to determining the object belongs to the particular object class; build or select an extraction model based at least in part on the one or more determined object features; and extract data from the digital image using the extraction model. The extraction model excludes, and/or the extraction process does not utilize, optical character recognition (OCR) techniques. Related systems and computer program products are also disclosed.
Abstract:
A method is provided for organizing data sets. In use, an automatic decision system is created or updated for determining whether data elements fit a predefined organization or not, where the decision system is based on a set of preorganized data elements. A plurality of data elements is organized using the decision system. At least one organized data element is selected for output to a user based on a score or confidence from the decision system for the at least one organized data element. Additionally, at least a portion of the at least one organized data element is output to the user. A response is received from the user comprising at least one of a confirmation, a modification, and a negation of the organization of the at least one organized data element. The automatic decision system is recreated or updated based on the user response. Other embodiments are also presented.
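The feedback loop described here resembles one round of interactive learning. The sketch below substitutes a nearest-centroid classifier for the decision system and picks the least confident element for review; both choices are assumptions for illustration, as are the function names and the `ask_user` callback.

```python
from collections import defaultdict
from math import dist


def build_centroids(labeled):
    """Create the decision system: one mean vector per category.
    `labeled` is a list of (feature_vector, category) pairs."""
    groups = defaultdict(list)
    for vector, category in labeled:
        groups[category].append(vector)
    return {category: tuple(sum(column) / len(vectors) for column in zip(*vectors))
            for category, vectors in groups.items()}


def organize(centroids, vector):
    """Return (category, confidence). Confidence is the distance margin
    between the nearest and second-nearest centroids."""
    ranked = sorted((dist(vector, centroid), category)
                    for category, centroid in centroids.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def review_round(labeled, unlabeled, ask_user):
    """Organize the unlabeled elements, show the least confident one to the
    user, and rebuild the decision system from the user's response."""
    centroids = build_centroids(labeled)
    scored = [(organize(centroids, vector), vector) for vector in unlabeled]
    (proposed, _), vector = min(scored, key=lambda item: item[0][1])
    # ask_user returns the proposed category (confirmation) or a different one (modification).
    final = ask_user(vector, proposed)
    return build_centroids(labeled + [(vector, final)]), (vector, proposed, final)
```

Repeating `review_round` with the growing labeled set gives the create, organize, review, update cycle the abstract outlines.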
Abstract:
A method involves: receiving an image comprising an ID; iteratively classifying the ID; and driving at least a portion of a workflow based at least in part on the classifying; wherein at least some of the classification iterations are based at least in part on comparing feature vector data, wherein a first classification iteration comprises determining the ID belongs to a particular class, and wherein each classification iteration subsequent to the first classification iteration comprises determining whether the ID belongs to a subclass falling within the particular class to which the ID was determined to belong in a prior classification iteration. Related systems and computer program products are also disclosed.
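The iterative scheme can be pictured as a descent through a class hierarchy, comparing feature vectors at each level. In this sketch the hierarchy, the two-dimensional reference vectors, the class names, and the `max_distance` cutoff are all hypothetical.

```python
from math import dist

# Hypothetical hierarchy: each class has a reference feature vector and optional subclasses.
HIERARCHY = {
    "drivers_license": {
        "vector": (1.0, 0.0),
        "subclasses": {
            "state_A": {"vector": (1.0, 0.2), "subclasses": {}},
            "state_B": {"vector": (1.0, 0.8), "subclasses": {}},
        },
    },
    "passport": {"vector": (0.0, 1.0), "subclasses": {}},
}


def classify_iteratively(features, classes, max_distance=1.0):
    """Return the path of classes from most general to most specific.
    Each iteration considers only subclasses of the previously chosen class."""
    path = []
    while classes:
        name, node = min(classes.items(),
                         key=lambda item: dist(features, item[1]["vector"]))
        if dist(features, node["vector"]) > max_distance:
            break  # nothing at this level is close enough; stop refining
        path.append(name)
        classes = node["subclasses"]
    return path
```

The resulting path, for example a license class followed by an issuing state, could then be used to choose which part of a workflow to drive.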