Method and means for extracting fixed-pitch characters on noisy images
with complex background prior to character recognition
    1.
    Granted invention patent (status: expired)

    Publication No.: US5915039A

    Publication date: 1999-06-22

    Application No.: US747350

    Filing date: 1996-11-12

    IPC classification: G06K9/20 G06K9/32 G06K9/72

    Abstract: Fixed-pitch, fixed-font characters embedded in a noisy gray-scale image of picture elements (pels) within a complex background can be extracted prior to execution of any recognition operations by first deriving a normalized Boolean-coded image from the gray-scale image. Then, a subset of at least three uncontaminated character triples is formed by filtering the Boolean-coded image. Next, an affine transform is approximated from locations in the Boolean-coded image of at least three noncollinear ones of the uncontaminated character triples. Lastly, the locations in a logical matrix array of all possible character triples are estimated according to the affine transform.
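The last two steps of the abstract (fit an affine transform from three noncollinear reference locations, then predict every cell of the logical character grid) can be illustrated with a minimal sketch. It is not the patented procedure: it treats each reference as a single point rather than a character triple, uses exactly three correspondences instead of "at least three", and the names `fit_affine`, `apply_affine`, and `predict_grid` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Affine = Tuple[float, float, float, float, float, float]


def fit_affine(grid_pts: Sequence[Point], image_pts: Sequence[Point]) -> Affine:
    """Fit x = a*c + b*r + tx and y = d*c + e*r + ty from three correspondences.

    grid_pts are (column, row) indices in the logical grid; image_pts are the
    matching (x, y) pel locations. Assumes exactly three noncollinear points.
    """
    (c0, r0), (c1, r1), (c2, r2) = grid_pts
    det = (c1 - c0) * (r2 - r0) - (c2 - c0) * (r1 - r0)
    if abs(det) < 1e-12:
        raise ValueError("reference grid points are collinear")
    coeffs: List[float] = []
    for k in (0, 1):  # k = 0 solves for x, k = 1 solves for y
        v0, v1, v2 = (p[k] for p in image_pts)
        lin_c = ((v1 - v0) * (r2 - r0) - (v2 - v0) * (r1 - r0)) / det
        lin_r = ((c1 - c0) * (v2 - v0) - (c2 - c0) * (v1 - v0)) / det
        offset = v0 - lin_c * c0 - lin_r * r0
        coeffs.extend([lin_c, lin_r, offset])
    return (coeffs[0], coeffs[1], coeffs[2], coeffs[3], coeffs[4], coeffs[5])


def apply_affine(t: Affine, col: float, row: float) -> Point:
    """Map one logical grid position to an image location."""
    a, b, tx, d, e, ty = t
    return (a * col + b * row + tx, d * col + e * row + ty)


def predict_grid(t: Affine, n_rows: int, n_cols: int) -> List[List[Point]]:
    """Estimate the image location of every cell in an n_rows x n_cols grid."""
    return [[apply_affine(t, c, r) for c in range(n_cols)] for r in range(n_rows)]


# Illustrative values only: three clean references found at known grid cells.
transform = fit_affine([(0, 0), (1, 0), (0, 1)], [(10, 20), (22, 21), (9, 40)])
grid = predict_grid(transform, n_rows=4, n_cols=3)
```

With more than three clean references, a least-squares fit over all of them would be the natural extension of this sketch.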


    Long term archiving of digital information
    2.
    Granted invention patent (status: expired)

    Publication No.: US06691309B1

    Publication date: 2004-02-10

    Application No.: US09513345

    Filing date: 2000-02-25

    IPC classification: G06F9/445

    CPC classification: G06F9/45504 G06F17/30073

    Abstract: Digital data is preserved by archiving on a removable medium. In the long term, the saved data bit stream must be correctly interpreted. For a computer program or system to be archived, the bit stream constituting the program must be archived and the code must be executable at restore time. The program that restores the data does not “see” the contents of the data itself, but accesses it by issuing a function call to an executor. A description of which methods are available to restore the information hidden in the data is always available. A text tells the client which functions are available and what their purposes are. The archiving method is based on using a virtual computer instruction set and saving the algorithm as a program written in that virtual machine language. For machine instructions to be executed many years later, for example 100 years, an emulator of the original machine would be written on the future hardware. Any machine manufacturer in the originating year would develop for each architecture a Universal Virtual Computer (UVC) description of the machine. Each originating instruction would be mapped into a small program of UVC instructions. All manufacturers of new architectures would then have to write a UVC executor which would be able to execute UVC instructions on the machine running 100 years in the future.
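The idea of archiving the decoding algorithm alongside the data, as a program for a small virtual instruction set, can be illustrated with a toy executor. The instruction set below is hypothetical and far simpler than the UVC the abstract describes; the opcode names, the `execute` function, and the "+1 per byte" encoding are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Instruction = Tuple  # (opcode, operand, ...); a hypothetical toy format

# Archived decoder: each stored byte is the original byte plus one.
DECODER: List[Instruction] = [
    ("LOAD", "r1", -1),          # 0: r1 = -1
    ("READ", "r0"),              # 1: r0 = next archived byte, or -1 at end
    ("JNEG", "r0", 6),           # 2: stop when the data is exhausted
    ("ADD", "r0", "r0", "r1"),   # 3: undo the +1 encoding
    ("EMIT", "r0"),              # 4: append the decoded byte to the output
    ("JMP", 1),                  # 5: loop
    ("HALT",),                   # 6: done
]


def execute(program: List[Instruction], data: bytes) -> bytes:
    """Run an archived program against archived data and return its output.

    A future platform only needs to reimplement this executor; the client
    never interprets the archived bytes directly.
    """
    regs: Dict[str, int] = {}
    out = bytearray()
    pos = 0  # read position in the archived data
    pc = 0   # program counter
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "LOAD":
            regs[args[0]] = args[1]
        elif op == "READ":
            if pos < len(data):
                regs[args[0]] = data[pos]
                pos += 1
            else:
                regs[args[0]] = -1
        elif op == "ADD":
            dst, lhs, rhs = args
            regs[dst] = regs[lhs] + regs[rhs]
        elif op == "EMIT":
            out.append(regs[args[0]])
        elif op == "JNEG":
            if regs[args[0]] < 0:
                pc = args[1]
        elif op == "JMP":
            pc = args[0]
        elif op == "HALT":
            return bytes(out)
        else:
            raise ValueError(f"unknown opcode: {op}")


restored = execute(DECODER, b"VWD")
```

The design point the sketch tries to show is the division of labor: the decoder travels with the archive, while only the executor has to be rewritten for each new machine.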


    Optical character recognition system having context analyzer
    3.
    Granted invention patent (status: expired)

    Publication No.: US06577755B1

    Publication date: 2003-06-10

    Application No.: US08938044

    Filing date: 1997-09-26

    IPC classification: G06K9/00

    CPC classification: G06K9/726 G06K2209/01

    Abstract: An optical character recognition (OCR) system is provided, in which syntactical and semantic rules, provided along with an input image to be scanned and applicable to the contents of the scanned image, are used in connection with the results of the OCR scan to identify the scanned characters. As a result, the recognition rate and confidence are enhanced. By providing the checking based on syntactical and semantic rules within the OCR system, application programs which would receive and use the OCR results are freed from the added burden of having to perform their own syntactical and/or semantic checking on the OCR results the application programs receive from the OCR system.
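One way to picture rule-assisted recognition is to let a caller-supplied syntactic rule select among the per-character alternatives produced by the OCR scan. The sketch below assumes the rule arrives as a regular expression and that the recognizer reports a confidence for each alternative; both are assumptions, as is the name `best_reading`. Semantic rules are not modeled.

```python
import itertools
import math
import re
from typing import List, Optional, Tuple

Candidate = Tuple[str, float]  # (character, recognizer confidence)


def best_reading(candidates: List[List[Candidate]],
                 rule: str) -> Tuple[Optional[str], float]:
    """Return the highest-scoring reading that satisfies the syntactic rule.

    candidates holds, for each character position, the alternatives offered
    by the recognizer. A reading's score is the product of its confidences.
    Returns (None, 0.0) if no combination satisfies the rule.
    """
    pattern = re.compile(rule)
    best_text: Optional[str] = None
    best_score = 0.0
    for combo in itertools.product(*candidates):
        text = "".join(ch for ch, _ in combo)
        if not pattern.fullmatch(text):
            continue  # reading violates the rule supplied with the image
        score = math.prod(conf for _, conf in combo)
        if score > best_score:
            best_text, best_score = text, score
    return best_text, best_score


# Illustrative field: a four-digit year where "l"/"1" and "O"/"0" are confused.
year_candidates = [
    [("l", 0.6), ("1", 0.4)],
    [("9", 0.9)],
    [("O", 0.7), ("0", 0.3)],
    [("6", 0.8), ("b", 0.2)],
]
text, score = best_reading(year_candidates, r"\d{4}")
```

Here the unconstrained top choice would be "l9O6"; the rule steers the result to the all-digit reading. Exhaustive enumeration is only practical for short fields, so a real system would need a more economical search.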


    Verification and correction method and system for optical character
recognition
    4.
    Granted invention patent (status: expired)

    Publication No.: US5933531A

    Publication date: 1999-08-03

    Application No.: US697380

    Filing date: 1996-08-23

    Abstract: An optical character recognition method and system are provided, employing context analysis and operator input, alternatively and in combination, on the same batch of documents. After automatic character recognition, the context analyzer processes the fields that are good enough to expect resolution. This will accept as many fields as possible without any operator intervention. For some other fields, the process uses operator input to certify the character-level OCR result of, or to enter, a certain percentage of the characters, so that context analysis may accept some of the remaining fields. If the context analyzer successfully identifies a small set of very close hypotheses, the process asks the operator to certify one or two characters to resolve the ambiguity between the hypotheses. For the fields that are still not resolved, the fields and the hypotheses are shown to the operator for acceptance, correction, or entry.
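The routing described in the abstract (accept automatically, ask the operator to certify one or two characters, or hand over the whole field) can be sketched as a small decision function. The thresholds, the scoring of hypotheses, and the names `differing_positions` and `route_field` are assumptions for illustration, not details taken from the patent.

```python
from typing import List, Optional, Tuple

Hypothesis = Tuple[str, float]  # (field text, context-analyzer score)


def differing_positions(texts: List[str]) -> Optional[List[int]]:
    """Return the character positions where equal-length hypotheses disagree.

    Returns None when the hypotheses have different lengths.
    """
    length = len(texts[0])
    if any(len(t) != length for t in texts):
        return None
    return [i for i in range(length) if len({t[i] for t in texts}) > 1]


def route_field(hypotheses: List[Hypothesis],
                accept_threshold: float = 0.95,
                max_certify: int = 2) -> Tuple[str, List[int]]:
    """Decide how a field is handled.

    Returns ("accept", []) when the best hypothesis is confident enough,
    ("certify", positions) when a few close hypotheses differ in at most
    max_certify characters, and ("review", []) otherwise.
    """
    if not hypotheses:
        return ("review", [])
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    if ranked[0][1] >= accept_threshold:
        return ("accept", [])
    texts = [text for text, _ in ranked]
    if len(texts) > 1:
        positions = differing_positions(texts)
        if positions and len(positions) <= max_certify:
            return ("certify", positions)
    return ("review", [])


decision = route_field([("SMITH", 0.50), ("SMYTH", 0.45)])
```

In this example the two hypotheses differ only at index 2, so the operator would be asked about that single character rather than the whole field.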
