Iterative recognition-guided thresholding and data extraction

    Publication No.: US10242285B2

    Publication date: 2019-03-26

    Application No.: US15214351

    Filing date: 2016-07-19

    Applicant: Kofax, Inc.

    Abstract: Techniques for improved binarization and extraction of information from digital image data are disclosed in accordance with various embodiments. The inventive concepts include independently binarizing portions of the image data on the basis of individual features, e.g. per connected component, and using multiple different binarization thresholds to obtain the best possible binarization result for each portion of the image data independently binarized. Determining the quality of each binarization result may be based on attempted recognition and/or extraction of information therefrom. Independently binarized portions may be assembled into a contiguous result. In one embodiment, a method includes: identifying a region of interest within a digital image; generating a plurality of binarized images based on the region of interest using different binarization thresholds; and extracting data from some or all of the plurality of binarized images. Corresponding systems and computer program products are also disclosed.
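
The idea of trying several thresholds on one region and keeping the result that recognizes best can be sketched as follows. This is a minimal illustration, not the patented method: the names `binarize`, `best_binarization`, and `toy_score` are hypothetical, and `toy_score` is only a stand-in for a real recognition or extraction confidence.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def binarize(region: Image, threshold: int) -> Image:
    """Map pixels darker than `threshold` to 1 (foreground), others to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in region]


def best_binarization(
    region: Image,
    thresholds: Sequence[int],
    score: Callable[[Image], float],
) -> Tuple[int, Image, float]:
    """Binarize `region` once per threshold and keep the best-scoring result.

    `score` is assumed to return a higher value for a better result, for
    example the confidence reported by a recognition engine.
    """
    best = None
    for t in thresholds:
        candidate = binarize(region, t)
        s = score(candidate)
        if best is None or s > best[2]:
            best = (t, candidate, s)
    if best is None:
        raise ValueError("at least one threshold is required")
    return best


def toy_score(binary: Image) -> float:
    """Hypothetical stand-in scorer: prefers about 50% foreground pixels."""
    total = sum(len(row) for row in binary)
    foreground = sum(sum(row) for row in binary)
    return 1.0 - abs(foreground / total - 0.5)


region = [[10, 200], [220, 90]]
threshold, binary, quality = best_binarization(region, [50, 100, 250], toy_score)
```

Running the same selection per connected component, rather than once per image, would give the independent per-portion binarization the abstract describes.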

    Selective, user-mediated content recognition using mobile devices

    Publication No.: US10049268B2

    Publication date: 2018-08-14

    Application No.: US15059242

    Filing date: 2016-03-02

    Applicant: Kofax, Inc.

    Abstract: A method includes: displaying a digital image on a first portion of a display of a mobile device; receiving user feedback via the display of the mobile device; analyzing the user feedback to determine a meaning of the user feedback; based on the determined meaning of the user feedback, analyzing a portion of the digital image corresponding to either the point of interest or the region of interest to detect one or more connected components depicted within the portion of the digital image; classifying each detected connected component depicted within the portion of the digital image; estimating an identity of each detected connected component based on the classification of the detected connected component; and one or more of: displaying the identity of each detected connected component on a second portion of the display of the mobile device; and providing the identity of each detected connected component to a workflow.
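
Connected-component detection on a binarized image portion can be illustrated with a breadth-first flood fill. This is a generic sketch; the abstract does not say which labeling algorithm or connectivity is used, so the 4-connectivity and the function name `connected_components` are assumptions.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, column)


def connected_components(binary: List[List[int]]) -> List[List[Pixel]]:
    """Group foreground pixels (value 1) into 4-connected components."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components: List[List[Pixel]] = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Start a new component and flood-fill outward from this pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels: List[Pixel] = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


grid = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
parts = connected_components(grid)
```

Each resulting pixel group could then be passed to a classifier to estimate its identity, as the abstract outlines.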

    MACHINE PRINT, HAND PRINT, AND SIGNATURE DISCRIMINATION

    Publication No.: US20180189558A1

    Publication date: 2018-07-05

    Application No.: US15910797

    Filing date: 2018-03-02

    Applicant: Kofax, Inc.

    IPC classification: G06K9/00 G06K9/34

    Abstract: Computer program products for discriminating hand and machine print from each other, and from signatures, are disclosed and include program code readable and/or executable by a processor to: receive an image; determine a color depth of the image; reduce the color depth of non-bi-tonal images to generate a bi-tonal representation of the image; identify a set of one or more graphical line candidates in either the bi-tonal image or the bi-tonal representation, the graphical line candidates including true graphical lines and/or false positives; discriminate any of the true graphical lines from any of the false positives; remove the true graphical lines from the bi-tonal image or the bi-tonal representation without removing the false positives to generate a component map comprising connected components and excluding graphical lines; identify one or more of the connected components in the component map; and output and/or display an indicator of each of the connected components.

    Systems and methods for organizing data sets

    Publication No.: US09754014B2

    Publication date: 2017-09-05

    Application No.: US15422435

    Filing date: 2017-02-01

    Applicant: Kofax, Inc.

    IPC classification: G06F17/30

    Abstract: According to one embodiment, a computer-implemented method for confirming/rejecting a most relevant example includes: generating a binary decision model by training a binary classifier using a plurality of training documents; classifying one or more test documents into one of a plurality of categories using the binary decision model, wherein the one or more test documents lack a user-defined category label; selecting a most relevant example of the classified test documents from among the classified test documents; displaying, using a display of the computer, the most relevant example of the classified test documents to a user; receiving, via the computer and from the user, a confirmation or a negation of a classification label of the most relevant example of the classified test documents; and storing the confirmation or the negation of the classification label of the most relevant example of the classified test documents to a memory of the computer.

    GLOBAL GEOGRAPHIC INFORMATION RETRIEVAL, VALIDATION, AND NORMALIZATION
    Type: patent application (status: in force)

    Publication No.: US20160328610A1

    Publication date: 2016-11-10

    Application No.: US15146848

    Filing date: 2016-05-04

    Applicant: Kofax, Inc.

    Abstract: According to one embodiment, a computer-implemented method includes: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting an identifier of the document from the image based at least in part on the OCR; comparing the identifier with content from one or more reference data sources, wherein the content from the one or more reference data sources comprises global address information; and determining whether the identifier is valid based at least in part on the comparison. The method may optionally include normalizing the extracted identifier, retrieving additional geographic information, correcting OCR errors, etc. based on comparing extracted information with reference content. Corresponding systems and computer program products are also disclosed.
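
The compare, validate, and normalize steps can be sketched as a lookup against reference data. Everything here is an assumption made for illustration: the `OCR_CONFUSIONS` table, the function names, and the sample reference entries are hypothetical and do not come from the patent.

```python
from typing import Iterable, Optional

# Hypothetical table of letter/digit pairs that OCR engines often confuse.
OCR_CONFUSIONS = {"O": "0", "I": "1", "S": "5", "B": "8"}


def normalize(identifier: str) -> str:
    """Uppercase the identifier and drop everything except letters and digits."""
    return "".join(ch for ch in identifier.upper() if ch.isalnum())


def canonical(text: str) -> str:
    """Collapse commonly confused characters onto a single representative."""
    return "".join(OCR_CONFUSIONS.get(ch, ch) for ch in text)


def validate(identifier: str, reference: Iterable[str]) -> Optional[str]:
    """Return the matching reference entry, or None if the identifier is invalid.

    An exact match after normalization is tried first; a match that ignores
    the confusable characters is the fallback, which also corrects OCR errors
    because the reference spelling is what gets returned.
    """
    by_normalized = {normalize(entry): entry for entry in reference}
    candidate = normalize(identifier)
    if candidate in by_normalized:
        return by_normalized[candidate]
    by_canonical = {canonical(key): entry for key, entry in by_normalized.items()}
    return by_canonical.get(canonical(candidate))


reference_codes = ["SW1A 1AA", "90210"]
exact = validate("sw1a-1aa", reference_codes)
corrected = validate("9O21O", reference_codes)
missing = validate("12345", reference_codes)
```

Returning the reference spelling means a valid identifier comes back already normalized, while an unknown one is reported as invalid.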


    Systems and methods for generating composite images of long documents using mobile video data
    Type: granted patent (status: in force)

    Publication No.: US09386235B2

    Publication date: 2016-07-05

    Application No.: US14542157

    Filing date: 2014-11-14

    Applicant: Kofax, Inc.

    Abstract: Systems, methods, and computer program products are disclosed and include: initiating a capture operation using an image capture component of a mobile device, the capture operation comprising: capturing video data; and estimating a plurality of motion vectors corresponding to motion of the image capture component during the capture operation. The systems, techniques, and computer program products also include detecting a document depicted in the video data; tracking a position of the detected document throughout the video data; selecting a plurality of images using the image capture component of the mobile device, wherein the selection is based at least in part on: the tracked position of the detected document; and the estimated motion vectors; and generating a composite image based on at least some of the selected plurality of images.
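
Selecting frames from the tracked document position and the estimated motion vectors might look like the sketch below. The selection rule is an assumption, since the abstract states only that both inputs are used: a frame is kept when the camera is nearly still and the document has advanced far enough since the last kept frame. `Frame`, `select_frames`, `min_advance`, and `max_motion` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    index: int
    doc_offset: float            # tracked document position along the scan axis, in pixels
    motion: Tuple[float, float]  # estimated (dx, dy) motion vector for this frame


def select_frames(frames: List[Frame], min_advance: float, max_motion: float) -> List[Frame]:
    """Keep steady frames that each show a new stretch of the document."""
    selected: List[Frame] = []
    last_offset = None
    for frame in frames:
        # Skip frames captured while the camera moved too fast (likely blurred).
        if math.hypot(*frame.motion) > max_motion:
            continue
        # Keep the first steady frame, then only frames that advanced enough.
        if last_offset is None or abs(frame.doc_offset - last_offset) >= min_advance:
            selected.append(frame)
            last_offset = frame.doc_offset
    return selected


video = [
    Frame(0, 0.0, (0.0, 1.0)),
    Frame(1, 50.0, (0.0, 30.0)),
    Frame(2, 120.0, (0.0, 2.0)),
    Frame(3, 150.0, (0.0, 1.0)),
    Frame(4, 260.0, (1.0, 1.0)),
]
kept = [frame.index for frame in select_frames(video, min_advance=100.0, max_motion=5.0)]
```

The kept frames would then be the input to compositing; a real pipeline would also need overlap between consecutive frames so they can be aligned.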


    Smart optical input/output (I/O) extension for context-dependent workflows
    Type: granted patent (status: in force)

    Publication No.: US09349046B2

    Publication date: 2016-05-24

    Application No.: US14686644

    Filing date: 2015-04-14

    Applicant: Kofax, Inc.

    Abstract: Systems, methods, and computer program products for smart, automated capture of textual information using optical sensors of a mobile device are disclosed. The textual information is provided to a mobile application or workflow without requiring the user to manually enter or transfer the data, and without requiring user intervention such as a copy/paste operation. The capture and provision are context-aware, and can normalize or validate the captured textual information prior to entry in the workflow or mobile application. Other information required by the workflow and available to the mobile device optical sensors may also be captured and provided, in a single automatic process. As a result, the overall process of capturing information from optical input using a mobile device is significantly simplified and improved in terms of accuracy of data transfer/entry, speed and efficiency of workflows, and user experience.
