Platenless book scanner with line buffering to compensate for image skew
    1.
    Type: Granted patent
    Status: Expired

    Publication No.: US5764383A

    Publication date: 1998-06-09

    Application No.: US657704

    Filing date: 1996-05-30

    Abstract: A platenless book scanner with line buffering performs electronic perspective correction to account for rotation of the spine of a non-planar bound document relative to a reference line in a support plane of the platenless book scanner. A pre-scan of the non-planar bound document is performed to provide a geometrical contour map of the bound document. The geometrical contour map, which identifies displacement of the bound document from the support plane, is analyzed to calculate an angular offset between a spine of the bound document and the reference line in the support plane. The angular offset is used to identify a minimum number of scan line buffers for recording image data, from a set of scan line buffers. Once the minimum number of scan line buffers is filled with recorded image data, distortions caused by displacements of the non-planar bound document from the support plane and skew of the bound document relative to the reference line in the support plane are corrected. A scan line of image data is corrected by polling locations in the set of scan line buffers in accordance with the geometrical contour map, and by interpolating the polled locations to provide output pixels. To correct additional scan lines of image data, some of the image data stored in the minimum number of scan line buffers is replaced after recording more image data.
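
The buffering idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: it assumes the skew acts as a pure shear (the source row for output column x is x·tan(θ)), assumes a non-negative skew angle, and ignores the height correction from the contour map. The names `min_line_buffers`, `correct_line`, and `deskew` are hypothetical.

```python
import math
from collections import deque

def min_line_buffers(width, skew_rad):
    """Rows that must be buffered before one corrected line can be produced.

    Assumption: shear model, so the last column reads (width-1)*tan(skew) rows ahead.
    """
    span = (width - 1) * abs(math.tan(skew_rad))
    # +2 because linear interpolation reads rows floor(r) and floor(r) + 1
    return math.floor(span) + 2

def correct_line(buffers, skew_rad):
    """Build one output line by polling buffered rows and interpolating.

    buffers[0] is the oldest recorded row; skew_rad is assumed non-negative.
    """
    width = len(buffers[0])
    slope = math.tan(skew_rad)
    out = []
    for x in range(width):
        r = x * slope                      # fractional source row for this column
        r0 = math.floor(r)
        frac = r - r0
        upper = buffers[r0][x]
        lower = buffers[min(r0 + 1, len(buffers) - 1)][x]
        out.append((1 - frac) * upper + frac * lower)
    return out

def deskew(rows, skew_rad):
    """Yield corrected lines while keeping only the minimum number of rows."""
    ring = None
    for row in rows:
        if ring is None:
            ring = deque(maxlen=min_line_buffers(len(row), skew_rad))
        ring.append(row)                   # when full, the oldest row is replaced
        if len(ring) == ring.maxlen:
            yield correct_line(ring, skew_rad)
```

The bounded `deque` plays the role of the rolling set of scan line buffers: each newly recorded row displaces the oldest one, so memory stays proportional to the skew rather than to the page height.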

    Platenless book scanning system with a general imaging geometry
    2.
    Type: Granted patent
    Status: Expired

    Publication No.: US5760925A

    Publication date: 1998-06-02

    Application No.: US657711

    Filing date: 1996-05-30

    Abstract: An overhead scanning system records pages from bound documents in an upright and open condition. The scanning system is defined with a general imaging geometry that makes the scanning system readily portable, and provides the scanning system with a variable imaging area. Once an operator defines an imaging area of an image acquisition system, the operator positions a light stripe projector to project across the imaging area. After recording calibration data, a perspective transform is provided by a perspective transform generator. In operation, a first image of the bound document having a light stripe projected thereacross is recorded by the image acquisition system. A page shape transform generator is then used to derive a page shape transform. Subsequently, a second image of the bound document is recorded without projecting a light stripe thereacross. If the second image is warped because of foreshortening, or because of magnification due to the pages of the bound document being curved, an image correction system de-warps the second image using the perspective and page shape transforms. The de-warped image is reconstructed by "polling" each location in the second image to determine the value of each pixel in the de-warped image.
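
The "polling" step described above resembles inverse mapping: for each pixel of the de-warped image, a transform gives the location to read in the recorded image. The sketch below is an illustration under assumptions, not the patented procedure. It uses nearest-neighbour sampling, models only the perspective part as a 3x3 matrix `H`, and leaves the page shape transform to be composed into the `mapping` callable. The names `perspective` and `dewarp` are hypothetical.

```python
def perspective(H):
    """Return a function mapping output (x, y) to source (u, v) through 3x3 matrix H."""
    def apply(x, y):
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
        v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
        return u, v
    return apply

def dewarp(src, mapping, out_h, out_w, fill=0):
    """Reconstruct an image by polling the source at mapping(x, y) for every output pixel."""
    h, w = len(src), len(src[0])
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            u, v = mapping(x, y)
            ui, vi = round(u), round(v)    # nearest-neighbour poll (an assumption)
            inside = 0 <= ui < w and 0 <= vi < h
            row.append(src[vi][ui] if inside else fill)
        out.append(row)
    return out
```

Driving the loop from the output image guarantees that every output pixel receives exactly one value, which a forward mapping from the warped image would not.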

    System and method for forms recognition by synthesizing corrected localization of data fields
    3.
    Type: Granted patent
    Status: Active

    Publication No.: US09536141B2

    Publication date: 2017-01-03

    Application No.: US13537729

    Filing date: 2012-06-29

    Applicant: Eric Saund

    Inventor: Eric Saund

    IPC classification: G06F17/00 G06K9/00 G06F17/24

    Abstract: A method and system generate an idealized image of a form. An image of a form and a template model of the form are received. The form includes data fields. Word boxes of the image are identified. The word boxes are assigned to corresponding data fields of the form. An idealized image of the form is generated based on the assignments and the template model.

    Method for generating a graph lattice from a corpus of one or more data graphs

    Publication No.: US08872828B2

    Publication date: 2014-10-28

    Application No.: US12883464

    Filing date: 2010-09-16

    Applicant: Eric Saund

    Inventor: Eric Saund

    IPC classification: G06T17/20 G06T11/20

    CPC classification: G06T11/206

    Abstract: A document recognition system and method in which images are represented as a collection of primitive features whose spatial relations are represented as a graph. Useful subsets of all the possible subgraphs representing different portions of images are represented over a corpus of many images. The data structure is a lattice of subgraphs, and algorithms are provided to build and use the graph lattice efficiently and effectively.

    System and method for forms classification by line-art alignment
    6.
    Type: Granted patent
    Status: Active

    Publication No.: US08792715B2

    Publication date: 2014-07-29

    Application No.: US13539941

    Filing date: 2012-07-02

    IPC classification: G06K9/00

    CPC classification: G06K9/00449

    Abstract: A system and method to classify forms. An image representing a form of an unknown document type is received. The image includes line-art. Further, a plurality of template models corresponding to a plurality of different document types is received. The plurality of different document types is intended to include the correct document type of the unknown document. A subset of the plurality of template models is selected as candidate template models. The candidate template models include line-art junctions best matching line-art junctions of the received image. One of the candidate template models is selected as a best candidate template model. The best candidate template model includes horizontal and vertical lines best matching horizontal and vertical lines of the received image, respectively, aligned to the best candidate template model.
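
The two-stage selection in the abstract above (a shortlist by junction matching, then a final choice by line matching) can be sketched as follows. The scoring functions are simplifying assumptions, not the patent's: junctions are compared as exact `(type, x, y)` tuples, lines as `(orientation, position)` pairs within a pixel tolerance, and the alignment step is omitted. All function names and the dictionary layout are hypothetical.

```python
def junction_score(image_junctions, template_junctions):
    """Number of junctions shared by image and template (exact match, an assumption)."""
    return len(set(image_junctions) & set(template_junctions))

def line_score(image_lines, template_lines, tol=2):
    """Count image lines that have a same-orientation template line within tol pixels."""
    score = 0
    for orient, pos in image_lines:
        if any(o == orient and abs(p - pos) <= tol for o, p in template_lines):
            score += 1
    return score

def candidate_templates(image, templates, k=2):
    """Stage 1: keep the k templates whose junctions best match the image."""
    ranked = sorted(
        templates,
        key=lambda name: junction_score(image["junctions"], templates[name]["junctions"]),
        reverse=True,
    )
    return ranked[:k]

def classify(image, templates, k=2):
    """Stage 2: among the candidates, pick the template with the best line match."""
    candidates = candidate_templates(image, templates, k)
    return max(
        candidates,
        key=lambda name: line_score(image["lines"], templates[name]["lines"]),
    )
```

The cheap junction comparison prunes the template set so that the line comparison only runs on a few candidates.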

    System and method for localizing data fields on structured and semi-structured forms
    7.
    Type: Granted patent
    Status: Active

    Publication No.: US08781229B2

    Publication date: 2014-07-15

    Application No.: US13537630

    Filing date: 2012-06-29

    Applicant: Eric Saund

    Inventor: Eric Saund

    IPC classification: G06K9/34

    Abstract: A method and system to localize data fields of a form. An image of a form is received, where the form includes data fields. Word boxes of the image are identified. The word boxes are grouped into candidate zones, where each of the candidate zones includes one or more of the word boxes. Hypotheses are formed from the data fields and the candidate zones, where each hypothesis assigns one of the candidate zones to one of the data fields or a null data field. A constrained optimization search of the hypotheses is performed for an optimal set of hypotheses. The optimal set of hypotheses assigns word box groups to corresponding data fields.
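
The hypothesis search above can be illustrated with a small exhaustive version. The abstract does not say which constraints or search strategy are used, so this sketch assumes one constraint (each real data field receives at most one zone, while any number of zones may go to the null field) and simply enumerates every combination. The function `best_hypotheses` and the `score` callable are hypothetical.

```python
from itertools import product

def best_hypotheses(zones, fields, score, null_score=0):
    """Return (assignment, total) maximising the summed hypothesis scores.

    Each zone is assigned to one field or to None (the null data field).
    Assumed constraint: a real field may be used by at most one zone.
    """
    options = list(fields) + [None]
    best, best_total = None, float("-inf")
    for combo in product(options, repeat=len(zones)):
        used = [f for f in combo if f is not None]
        if len(used) != len(set(used)):     # a field was claimed twice: reject
            continue
        total = sum(
            null_score if f is None else score(z, f)
            for z, f in zip(zones, combo)
        )
        if total > best_total:
            best, best_total = dict(zip(zones, combo)), total
    return best, best_total
```

Enumeration grows exponentially with the number of zones, so it only serves to make the objective and the constraint concrete; a practical system would need a pruned or heuristic search.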

    Graph lattice method for image clustering, classification, and repeated structure finding
    8.
    Type: Granted patent
    Status: Active

    Publication No.: US08724911B2

    Publication date: 2014-05-13

    Application No.: US12883503

    Filing date: 2010-09-16

    Applicant: Eric Saund

    Inventor: Eric Saund

    CPC classification: G06K9/6892 G06K9/00449

    Abstract: A document recognition system and method in which images are represented as a collection of primitive features whose spatial relations are represented as a graph. Useful subsets of all the possible subgraphs representing different portions of images are represented over a corpus of many images. The data structure is a lattice of subgraphs, and algorithms are provided to build and use the graph lattice efficiently and effectively.

    SELECTIVE LEARNING FOR GROWING A GRAPH LATTICE
    9.
    Type: Patent application
    Status: Active

    Publication No.: US20130335422A1

    Publication date: 2013-12-19

    Application No.: US13527071

    Filing date: 2012-06-19

    Applicant: Eric Saund

    Inventor: Eric Saund

    IPC classification: G06T11/20

    CPC classification: G06T11/206 G06K9/00

    Abstract: A system and method generate a graph lattice from exemplary images. At least one processor receives exemplary data graphs of the exemplary images and generates graph lattice nodes of size one from primitives. Until a termination condition is met, the at least one processor repeatedly: 1) generates candidate graph lattice nodes from accepted graph lattice nodes; 2) selects one or more candidate graph lattice nodes preferentially discriminating exemplary data graphs which are less discriminable than other exemplary data graphs using the accepted graph lattice nodes; and 3) promotes the selected graph lattice nodes to accepted status. The graph lattice is formed from the accepted graph lattice nodes and relations between the accepted graph lattice nodes.
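
The generate, select, and promote loop above can be shown on a deliberately simplified toy. It is a strong simplification and not the patented algorithm: exemplars are multisets of primitive labels rather than graphs, a lattice node is a sorted tuple of primitives that "matches" when the exemplar contains enough of each, and the relations between nodes are not recorded. The names `matches`, `confusable_groups`, and `grow_lattice` are hypothetical.

```python
from collections import Counter

def matches(node, exemplar):
    """A node matches if the exemplar holds at least as many of each primitive."""
    return all(exemplar[p] >= c for p, c in Counter(node).items())

def confusable_groups(exemplars, accepted):
    """Groups of exemplar names the accepted nodes cannot tell apart, largest first."""
    groups = {}
    for name, ex in exemplars.items():
        signature = tuple(matches(n, ex) for n in accepted)
        groups.setdefault(signature, []).append(name)
    return sorted((g for g in groups.values() if len(g) > 1), key=len, reverse=True)

def grow_lattice(exemplars, max_nodes=50):
    primitives = sorted(set().union(*exemplars.values()))
    accepted = [(p,) for p in primitives]             # nodes of size one
    while len(accepted) < max_nodes:                  # termination: node budget
        groups = confusable_groups(exemplars, accepted)
        if not groups:                                # termination: all told apart
            break
        target = groups[0]                            # least discriminable exemplars
        # 1) candidates: every accepted node extended by one primitive
        candidates = {tuple(sorted(n + (p,))) for n in accepted for p in primitives}
        candidates -= set(accepted)
        if not candidates:
            break

        # 2) prefer the candidate that splits the target group most evenly
        def split(c):
            hits = sum(matches(c, exemplars[name]) for name in target)
            return min(hits, len(target) - hits)

        best = max(sorted(candidates), key=split)
        if split(best) == 0:                          # nothing separates the group
            break
        accepted.append(best)                         # 3) promote to accepted status
    return accepted
```

Targeting the largest confusable group first is one way to read "preferentially discriminating" the less discriminable data graphs; other selection criteria would fit the same loop.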

    METHOD FOR GENERATING A GRAPH LATTICE FROM A CORPUS OF ONE OR MORE DATA GRAPHS
    10.
    Type: Patent application
    Status: Active

    Publication No.: US20120069024A1

    Publication date: 2012-03-22

    Application No.: US12883464

    Filing date: 2010-09-16

    Applicant: Eric Saund

    Inventor: Eric Saund

    IPC classification: G06T11/20

    CPC classification: G06T11/206

    Abstract: A document recognition system and method in which images are represented as a collection of primitive features whose spatial relations are represented as a graph. Useful subsets of all the possible subgraphs representing different portions of images are represented over a corpus of many images. The data structure is a lattice of subgraphs, and algorithms are provided to build and use the graph lattice efficiently and effectively.
