Building classification and extraction models based on electronic forms

    Publication No.: US10140511B2

    Publication Date: 2018-11-27

    Application No.: US15396322

    Filing Date: 2016-12-30

    Applicant: Kofax, Inc.

    Abstract: According to one embodiment, a computer-implemented method is configured for building a classification and/or data extraction knowledge base using an electronic form. The method includes: receiving an electronic form having associated therewith a plurality of metadata labels, each metadata label corresponding to at least one element of interest represented within the electronic form; parsing the plurality of metadata labels to determine characteristic features of the element(s) of interest; building a representation of the electronic form based on the plurality of metadata labels; generating a plurality of permutations of the representation of the electronic form by applying a predetermined set of variations to the representation; and training either a classification model, an extraction model, or both using: the representation of the electronic form, and the plurality of permutations of the representation of the electronic form. Corresponding systems and computer program products are also disclosed.
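The representation-and-permutation steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `FormField` structure, the choice of position shifts as the "predetermined set of variations", and the function names are hypothetical and are not taken from the patent. Model training itself is left out; the sketch only assembles the training set.

```python
from dataclasses import dataclass, replace
from itertools import product
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class FormField:
    """Hypothetical element of interest derived from one metadata label."""
    label: str  # e.g. "invoice_number"
    x: int      # left edge of the field on the page
    y: int      # top edge of the field on the page


def build_representation(metadata: Dict[str, Tuple[int, int]]) -> List[FormField]:
    """Turn parsed metadata labels into a simple list-of-fields representation."""
    return [FormField(label, x, y) for label, (x, y) in sorted(metadata.items())]


def generate_permutations(
    fields: Sequence[FormField], offsets: Sequence[int] = (-5, 0, 5)
) -> List[List[FormField]]:
    """Apply an assumed set of variations (x/y shifts) to the whole form."""
    variants = []
    for dx, dy in product(offsets, repeat=2):
        if dx == 0 and dy == 0:
            continue  # skip the unmodified original
        variants.append([replace(f, x=f.x + dx, y=f.y + dy) for f in fields])
    return variants


# Made-up metadata: label -> (x, y) position of the labelled field.
metadata = {"invoice_number": (100, 40), "total": (100, 300)}
representation = build_representation(metadata)
permutations = generate_permutations(representation)

# The original representation plus its variants would feed model training.
training_set = [representation] + permutations
```

With three offsets per axis, the sketch yields eight shifted variants in addition to the original representation. A real system would presumably vary more than position (fonts, field content, noise), which the abstract does not specify.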

    Systems, methods and computer program products for determining document validity
    Type: Granted invention patent (legal status: in force)

    Publication No.: US09576272B2

    Publication Date: 2017-02-21

    Application No.: US14804278

    Filing Date: 2015-07-20

    Applicant: Kofax, Inc.

    Abstract: In one approach, a method includes: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting data of interest from the image based at least in part on the OCR; and validating the extracted data of interest against reference information stored on the mobile device. In another embodiment, a method includes: capturing an image of a document using a camera of a mobile device; performing optical character recognition (OCR) on the image of the document; extracting data of interest from the image based at least in part on the OCR; and validating authenticity of the document based on comparing some or all of the extracted data of interest to reference information stored on the mobile device.
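The final validation step can be sketched as a lookup of extracted values against locally stored reference information. This is a rough illustration under stated assumptions: image capture and OCR are not shown, the `extracted` dictionary stands in for their output, and the field names, reference values, and `normalize` rule are all hypothetical rather than drawn from the patent.

```python
from typing import Dict, Set


def normalize(value: str) -> str:
    """Assumed cleanup for OCR output: trim whitespace and ignore case."""
    return value.strip().upper()


def validate(extracted: Dict[str, str], reference: Dict[str, Set[str]]) -> Dict[str, bool]:
    """Check each extracted field against the reference values for that field."""
    results = {}
    for field, value in extracted.items():
        allowed = reference.get(field)
        # A field with no reference entry cannot be validated, so it fails.
        results[field] = allowed is not None and normalize(value) in allowed
    return results


# Made-up reference information, standing in for data stored on the device.
reference = {"license_number": {"AB123", "CD456"}}

# Made-up values, standing in for data of interest extracted via OCR.
extracted = {"license_number": " ab123 ", "state": "CA"}

results = validate(extracted, reference)
document_is_valid = all(results.values())
```

Treating a field without reference data as a failure is a conservative design choice made for this sketch; a different policy (for example, skipping such fields) would be equally consistent with the abstract.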

