SYSTEM AND METHOD FACILITATING DOCUMENT IMAGE COMPRESSION UTILIZING A MASK
    2.
    Patent application (status: in force)

    Publication number: US20060274381A1

    Publication date: 2006-12-07

    Application number: US11465083

    Filing date: 2006-08-16

    CPC classification number: G06K9/38 G06K2209/01

    Abstract: A system and method facilitating document image compression utilizing a mask separating a foreground of a document image from a background is provided. The invention includes a pixel energy analyzer adapted to partition regions into a foreground and background. The invention further provides for a merge region component adapted to attempt to merge regions if the merged region would not exceed a threshold energy. Merged regions are partitioned into a new foreground and new background. Thereafter, a mask storage component stores the partitioning information in a binary mask.
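
The merge step described in this abstract can be sketched roughly as follows. The abstract does not define "energy", so this sketch assumes it is the summed squared deviation of the foreground and background pixel values around their respective means; the names `sse`, `partition`, `try_merge`, and `max_energy` are hypothetical, and regions are assumed to be non-empty lists of grayscale values.

```python
from typing import List, Optional, Tuple


def sse(values: List[int]) -> float:
    """Sum of squared deviations from the mean (assumed energy measure)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def partition(pixels: List[int]) -> Tuple[float, int]:
    """Find the threshold that minimizes foreground + background energy.

    Returns (energy, threshold); pixels <= threshold count as foreground.
    """
    ordered = sorted(pixels)
    # Start from the trivial partition: every pixel in one class.
    best_energy, best_threshold = sse(ordered), ordered[-1]
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            continue  # equal values cannot be separated by a threshold
        energy = sse(ordered[:i]) + sse(ordered[i:])
        if energy < best_energy:
            best_energy, best_threshold = energy, ordered[i - 1]
    return best_energy, best_threshold


def try_merge(region_a: List[int], region_b: List[int],
              max_energy: float) -> Optional[List[int]]:
    """Merge two regions only if the merged energy stays within the limit.

    Returns the binary mask of the merged region (1 = foreground),
    or None when the merge is rejected.
    """
    merged = region_a + region_b
    energy, threshold = partition(merged)
    if energy > max_energy:
        return None
    return [1 if p <= threshold else 0 for p in merged]


region_a = [10, 12, 200, 205]
region_b = [11, 198, 202, 13]
print(try_merge(region_a, region_b, max_energy=50.0))
print(try_merge(region_a, region_b, max_energy=20.0))
```

The first call accepts the merge and returns a mask over the combined pixels; the second rejects it because the merged energy exceeds the tighter limit.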

    Processing an electronic document for information extraction
    3.
    Patent application (status: in force)

    Publication number: US20050125746A1

    Publication date: 2005-06-09

    Application number: US10909534

    Filing date: 2004-08-02

    Abstract: The present invention relates to a method of automatically processing an electronic document for routing over a computer network. The method includes recognizing text in the document to identify a candidate address, accessing a collection of potential destinations and comparing the candidate address to the collection of potential destinations to determine a destination for the document.
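
As an illustration of the compare step in this abstract, the sketch below assumes the candidate address is an e-mail address pulled from recognized text and that the collection of destinations is a plain list of known addresses. The helper names `find_candidate` and `resolve`, the regular expression, and the `cutoff` value are all hypothetical; fuzzy matching with `difflib` is a stand-in chosen here to tolerate recognition errors, not the method claimed in the application.

```python
import difflib
import re
from typing import List, Optional

# Simplified pattern for an e-mail-like token (assumption for this sketch).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_candidate(text: str) -> Optional[str]:
    """Return the first e-mail-like token in the recognized text, if any."""
    match = EMAIL_RE.search(text)
    return match.group(0).lower() if match else None


def resolve(candidate: str, destinations: List[str],
            cutoff: float = 0.8) -> Optional[str]:
    """Pick the known destination closest to the candidate address."""
    matches = difflib.get_close_matches(candidate, destinations,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None


destinations = ["john.smith@example.com", "jane.doe@example.com"]
recognized = "To: j0hn.smith@example.com\nRe: invoice"  # note the OCR error
candidate = find_candidate(recognized)
print(candidate, "->", resolve(candidate, destinations))
```

A similarity cutoff keeps a badly garbled or unknown address from being routed to an unrelated destination; in that case `resolve` returns `None`.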

    SEGMENTED LAYERED IMAGE SYSTEM
    5.
    Patent application (status: in force)

    Publication number: US20070025622A1

    Publication date: 2007-02-01

    Application number: US11465087

    Filing date: 2006-08-16

    CPC classification number: H04N1/403 G06K9/00456

    Abstract: Systems and methods for encoding and decoding document images are disclosed. Document images are segmented into multiple layers according to a mask. The multiple layers are non-binary. The respective layers can then be processed and compressed separately in order to achieve better compression of the document image overall. A mask is generated from a document image. The mask is generated so as to reduce an estimate of compression for the combined size of the mask and multiple layers of the document image. The mask is then employed to segment the document image into the multiple layers. The mask determines or allocates pixels of the document image into respective layers. The mask and the multiple layers are processed and encoded separately so as to improve compression of the document image overall and to improve the speed of so doing. The multiple layers are non-binary images and can, for example, comprise a foreground image and a background image.
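
The segmentation step in this abstract, where a mask allocates each pixel to a foreground or background layer, might look like the minimal sketch below. It assumes images are nested lists of pixel values and the mask is a same-shaped grid of 0/1 flags; `split_layers`, `merge_layers`, and the `None` placeholder for "don't care" pixels are hypothetical choices, and the separate compression of each layer is left out.

```python
from typing import List, Optional, Tuple

Layer = List[List[Optional[int]]]


def split_layers(image: List[List[int]],
                 mask: List[List[int]]) -> Tuple[Layer, Layer]:
    """Route each pixel to the foreground (mask = 1) or background (mask = 0).

    Pixels that belong to the other layer are left as None ("don't care").
    """
    foreground = [[px if m else None for px, m in zip(img_row, mask_row)]
                  for img_row, mask_row in zip(image, mask)]
    background = [[None if m else px for px, m in zip(img_row, mask_row)]
                  for img_row, mask_row in zip(image, mask)]
    return foreground, background


def merge_layers(foreground: Layer, background: Layer,
                 mask: List[List[int]]) -> Layer:
    """Rebuild the image by selecting from each layer according to the mask."""
    return [[f if m else b for f, b, m in zip(f_row, b_row, m_row)]
            for f_row, b_row, m_row in zip(foreground, background, mask)]


image = [[20, 240, 235],
         [25, 30, 238]]
mask = [[1, 0, 0],
        [1, 1, 0]]
fg, bg = split_layers(image, mask)
print(fg)
print(bg)
print(merge_layers(fg, bg, mask) == image)
```

Because the mask records which layer owns each pixel, the decoder only needs the mask and the two layers to reconstruct the image.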

    Low resolution OCR for camera acquired documents
    6.
    Patent application (status: in force)

    Publication number: US20050259866A1

    Publication date: 2005-11-24

    Application number: US10850335

    Filing date: 2004-05-20

    Abstract: A global optimization framework for optical character recognition (OCR) of low-resolution photographed documents that combines a binarization-type process, segmentation, and recognition into a single process. The framework includes a machine learning approach trained on a large amount of data. A convolutional neural network can be employed to compute a classification function at multiple positions and take grey-level input which eliminates binarization. The framework utilizes preprocessing, layout analysis, character recognition, and word recognition to output high recognition rates. The framework also employs dynamic programming and language models to arrive at the desired output.
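
The abstract mentions combining dynamic programming with a language model but gives no details. As a generic illustration only, the sketch below segments an unspaced string of recognized characters into words by maximizing the sum of per-word log-probabilities. The function `segment`, the toy `word_logprob` table, and the `max_len` limit are hypothetical, and this is a textbook word-segmentation routine rather than the framework described in the application.

```python
import math
from typing import Dict, List, Optional


def segment(chars: str, word_logprob: Dict[str, float],
            max_len: int = 12) -> Optional[List[str]]:
    """Split chars into known words with the highest total log-probability."""
    n = len(chars)
    # best[end] = (best score for chars[:end], start index of the last word)
    best = [(-math.inf, 0)] * (n + 1)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = chars[start:end]
            if word in word_logprob and best[start][0] > -math.inf:
                score = best[start][0] + word_logprob[word]
                if score > best[end][0]:
                    best[end] = (score, start)
    if best[n][0] == -math.inf:
        return None  # no segmentation into known words exists
    # Follow the back-pointers from the end of the string.
    words: List[str] = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(chars[start:end])
        end = start
    return words[::-1]


# Toy language model: higher (less negative) values mean more likely words.
lm = {"low": -1.0, "resolution": -2.0, "lo": -3.0, "w": -4.0}
print(segment("lowresolution", lm))
```

The table `best` stores one partial result per prefix, so each prefix is scored once instead of enumerating every possible split.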