System and method for forms classification by line-art alignment
    1.
    Granted invention patent
    Legal status: In force

    Publication No.: US08792715B2

    Publication date: 2014-07-29

    Application No.: US13539941

    Filing date: 2012-07-02

    IPC classification: G06K9/00

    CPC classification: G06K9/00449

    Abstract: A system and method to classify forms. An image representing a form of an unknown document type is received. The image includes line-art. Further, a plurality of template models corresponding to a plurality of different document types is received. The plurality of different document types is intended to include the correct document type of the unknown document. A subset of the plurality of template models are selected as candidate template models. The candidate template models include line-art junctions best matching line-art junctions of the received image. One of the candidate template models is selected as a best candidate template model. The best candidate template model includes horizontal and vertical lines best matching horizontal and vertical lines of the received image, respectively, aligned to the best candidate template model.
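
The two-stage selection described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `FormModel` structure, the pixel tolerance `tol`, the candidate count `k`, the fractional scoring, and the crude mean-offset translation used for alignment are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class FormModel:
    """Hypothetical container for the line-art of a form or template."""
    name: str
    junctions: set   # {(x, y)} positions of line-art junctions
    h_lines: list    # y-coordinates of horizontal lines
    v_lines: list    # x-coordinates of vertical lines


def junction_score(image, template, tol=2):
    """Fraction of template junctions with an image junction within tol pixels."""
    if not template.junctions:
        return 0.0
    hits = sum(
        any(abs(tx - ix) <= tol and abs(ty - iy) <= tol
            for ix, iy in image.junctions)
        for tx, ty in template.junctions
    )
    return hits / len(template.junctions)


def line_score(image_lines, template_lines, offset, tol=2):
    """Fraction of template lines matched by an image line shifted by offset."""
    if not template_lines:
        return 0.0
    shifted = [p + offset for p in image_lines]
    hits = sum(any(abs(t - s) <= tol for s in shifted) for t in template_lines)
    return hits / len(template_lines)


def mean(values):
    return sum(values) / len(values) if values else 0.0


def classify(image, templates, k=2):
    # Stage 1: keep the k templates whose junctions best match the image.
    candidates = sorted(
        templates, key=lambda t: junction_score(image, t), reverse=True
    )[:k]

    # Stage 2: align the image lines to each candidate (assumed here to be a
    # simple translation by the difference of mean line positions) and keep
    # the candidate whose horizontal and vertical lines match best.
    def aligned_line_score(t):
        dy = mean(t.h_lines) - mean(image.h_lines)
        dx = mean(t.v_lines) - mean(image.v_lines)
        return (line_score(image.h_lines, t.h_lines, dy)
                + line_score(image.v_lines, t.v_lines, dx))

    return max(candidates, key=aligned_line_score).name
```

The cheap junction comparison prunes the template set so that the alignment step only runs on a few candidates; a real system would presumably use a more robust alignment than the mean offset assumed here.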

    Method and system for document image classification
    2.
    Granted invention patent
    Legal status: In force

    Publication No.: US08520941B2

    Publication date: 2013-08-27

    Application No.: US12330817

    Filing date: 2008-12-09

    IPC classification: G06K9/00 G06K9/34 G06K9/68

    CPC classification: G06K9/00456 G06K9/00483

    Abstract: A method of classifying an input image includes the initial steps of labeling an input image in accordance with a class and extracting at least one connected component from the input image. The method also includes the steps of calculating at least one feature of the input image and generating a model based on the at least one calculated feature. The method also includes the steps of repeating at least one of the previous steps for at least one other input image and comparing the at least one other input image with the model. The at least one other input image is classified in accordance with the class of the model if the at least one calculated feature of the at least one other input image is substantially similar to that of the model.
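
The steps in this abstract (label, extract connected components, compute features, build a model, compare other images) can be sketched roughly as follows. The specific features (component count and mean component size), the L1 distance, and the `max_distance` threshold standing in for "substantially similar" are assumptions chosen for illustration; the abstract does not name them.

```python
from collections import deque


def connected_components(grid):
    """Return the sizes of 4-connected foreground (1) regions in a binary grid."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def features(grid):
    """Hypothetical feature vector: (component count, mean component size)."""
    sizes = connected_components(grid)
    count = len(sizes)
    return (count, sum(sizes) / count if count else 0.0)


def build_model(label, grid):
    """Generate a model from one labeled image."""
    return {"label": label, "features": features(grid)}


def classify(grid, model, max_distance=1.0):
    """Return the model's label if the features are close enough, else None."""
    f, m = features(grid), model["features"]
    distance = sum(abs(a - b) for a, b in zip(f, m))
    return model["label"] if distance <= max_distance else None
```

Returning `None` for a dissimilar image leaves room to test the same image against models of other classes.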

    SYSTEM AND METHOD FOR CLEAN DOCUMENT RECONSTRUCTION FROM ANNOTATED DOCUMENT IMAGES
    3.
    Invention patent application
    Legal status: In force

    Publication No.: US20110311145A1

    Publication date: 2011-12-22

    Application No.: US12819656

    Filing date: 2010-06-21

    IPC classification: G06K9/46 G06F17/00

    Abstract: A computer-implemented method and system for reconstructing a clean document from annotated document images and/or extracting annotations therefrom are provided. The method includes receiving a set of at least two annotated document images into computer memory, selecting a representative image from the set of annotated document images, performing a global alignment on each of the set of annotated document images with respect to the selected representative image, and forming a consensus document image based at least on the aligned annotated document images. A clean document based at least on the consensus document image is then formed which can be used for extracting the annotations.
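
A rough sketch of the pipeline in this abstract is given below. It rests on several assumptions that do not come from the patent: images are modeled as sets of foreground pixel coordinates, the first image is taken as the representative, global alignment is a brute-force search over small integer translations, and the consensus is a per-pixel majority vote.

```python
from collections import Counter
from itertools import product


def align(image, reference, max_shift=2):
    """Shift image by the integer offset that maximizes overlap with reference."""
    best = max(
        product(range(-max_shift, max_shift + 1), repeat=2),
        key=lambda d: len({(x + d[0], y + d[1]) for x, y in image} & reference),
    )
    return {(x + best[0], y + best[1]) for x, y in image}


def reconstruct(images):
    """Return (consensus pixels, per-image annotation pixels)."""
    # Assumption: the first image serves as the representative image.
    reference = images[0]
    aligned = [align(img, reference) for img in images]

    # Consensus: pixels present in a strict majority of the aligned images.
    votes = Counter(p for img in aligned for p in img)
    consensus = {p for p, n in votes.items() if n > len(images) / 2}

    # Annotations: whatever each aligned image contains beyond the consensus.
    annotations = [img - consensus for img in aligned]
    return consensus, annotations
```

Content shared by most copies survives the vote, while marks that appear in only one copy fall out as that copy's annotations.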

    SYSTEM AND METHOD FOR FORMS CLASSIFICATION BY LINE-ART ALIGNMENT
    4.
    Invention patent application
    Legal status: In force

    Publication No.: US20140003717A1

    Publication date: 2014-01-02

    Application No.: US13539941

    Filing date: 2012-07-02

    IPC classification: G06K9/68 G06K9/46

    CPC classification: G06K9/00449

    Abstract: A system and method to classify forms. An image representing a form of an unknown document type is received. The image includes line-art. Further, a plurality of template models corresponding to a plurality of different document types is received. The plurality of different document types is intended to include the correct document type of the unknown document. A subset of the plurality of template models are selected as candidate template models. The candidate template models include line-art junctions best matching line-art junctions of the received image. One of the candidate template models is selected as a best candidate template model. The best candidate template model includes horizontal and vertical lines best matching horizontal and vertical lines of the received image, respectively, aligned to the best candidate template model.

    System and method for clean document reconstruction from annotated document images
    5.
    Granted invention patent
    Legal status: In force

    Publication No.: US08606046B2

    Publication date: 2013-12-10

    Application No.: US12819656

    Filing date: 2010-06-21

    IPC classification: G06T7/00

    Abstract: A computer-implemented method and system for reconstructing a clean document from annotated document images and/or extracting annotations therefrom are provided. The method includes receiving a set of at least two annotated document images into computer memory, selecting a representative image from the set of annotated document images, performing a global alignment on each of the set of annotated document images with respect to the selected representative image, and forming a consensus document image based at least on the aligned annotated document images. A clean document based at least on the consensus document image is then formed which can be used for extracting the annotations.

    METHOD AND SYSTEM FOR DOCUMENT IMAGE CLASSIFICATION
    8.
    Invention patent application
    Legal status: In force

    Publication No.: US20100142832A1

    Publication date: 2010-06-10

    Application No.: US12330817

    Filing date: 2008-12-09

    IPC classification: G06K9/68

    CPC classification: G06K9/00456 G06K9/00483

    Abstract: A method of classifying an input image includes the initial steps of labeling an input image in accordance with a class and extracting at least one connected component from the input image. The method also includes the steps of calculating at least one feature of the input image and generating a model based on the at least one calculated feature. The method also includes the steps of repeating at least one of the previous steps for at least one other input image and comparing the at least one other input image with the model. The at least one other input image is classified in accordance with the class of the model if the at least one calculated feature of the at least one other input image is substantially similar to that of the model.
