Document page segmentation in optical character recognition
    1.
    Granted invention patent
    Document page segmentation in optical character recognition (status: in force)

    Publication number: US08509534B2

    Publication date: 2013-08-13

    Application number: US12720943

    Filing date: 2010-03-10

    IPC classification: G06K9/36

    Abstract: Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters “sit”) and mean line (the line under which most of the characters “hang”). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.
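
    Illustrative sketch (not part of the patent record): the abstract describes native line candidates as sets of horizontally neighboring connected components with similar baseline and mean line values. The Python below is one minimal way to picture that grouping step. The names `Component`, `LineCandidate`, `group_native_lines`, and the thresholds `max_gap` and `tol` are assumptions made for illustration, and the greedy left-to-right strategy is a stand-in, not the method claimed in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A connected component, reduced to the statistics used for grouping (hypothetical)."""
    x0: int            # left edge of the bounding box
    x1: int            # right edge of the bounding box
    mean_line: float   # y coordinate of the line the glyph "hangs" under
    baseline: float    # y coordinate of the line the glyph "sits" on


@dataclass
class LineCandidate:
    """A native line candidate: components that neighbor each other horizontally."""
    members: list = field(default_factory=list)

    @property
    def baseline(self) -> float:
        return sum(c.baseline for c in self.members) / len(self.members)

    @property
    def mean_line(self) -> float:
        return sum(c.mean_line for c in self.members) / len(self.members)

    @property
    def right(self) -> int:
        return max(c.x1 for c in self.members)


def group_native_lines(components, max_gap=20, tol=3.0):
    """Greedily attach each component to the first candidate that is close
    horizontally and has a similar baseline and mean line (assumed thresholds)."""
    lines = []
    for comp in sorted(components, key=lambda c: c.x0):
        for line in lines:
            close = comp.x0 - line.right <= max_gap
            similar = (abs(comp.baseline - line.baseline) <= tol
                       and abs(comp.mean_line - line.mean_line) <= tol)
            if close and similar:
                line.members.append(comp)
                break
        else:
            # No existing candidate fits, so this component starts a new one.
            lines.append(LineCandidate([comp]))
    return lines
```

    In this sketch, a component far to the right of a candidate starts a new candidate even when its baseline matches, which is one possible reading of "horizontally neighboring". The later binary classification of candidates as textual or non-textual is not modeled here.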

    USER CORRECTION OF ERRORS ARISING IN A TEXTUAL DOCUMENT UNDERGOING OPTICAL CHARACTER RECOGNITION (OCR) PROCESS
    2.
    Invention patent application
    USER CORRECTION OF ERRORS ARISING IN A TEXTUAL DOCUMENT UNDERGOING OPTICAL CHARACTER RECOGNITION (OCR) PROCESS (status: under examination, published)

    Publication number: US20110280481A1

    Publication date: 2011-11-17

    Application number: US12780991

    Filing date: 2010-05-17

    IPC classification: G06K9/03 G06K9/34

    CPC classification: G06K9/033

    Abstract: An electronic model of the image document is created by undergoing an OCR process. The electronic model includes elements (e.g., words, text lines, paragraphs, images) of the image document that have been determined by each of a plurality of sequentially executed stages in the OCR process. The electronic model serves as input information which is supplied to each of the stages by a previous stage that processed the image document. A graphical user interface is presented to the user so that the user can provide user input data correcting a mischaracterized item appearing in the document. Based on the user input data, the processing stage which produced the initial error that gave rise to the mischaracterized item corrects the initial error. Stages of the OCR process subsequent to this stage then correct any consequential errors arising in their respective stages as a result of the initial error.
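
    Illustrative sketch (not part of the patent record): the abstract describes sequential stages that pass an electronic model forward, with a user correction applied at the stage that made the initial error and the later stages re-run to fix consequential errors. The toy Python below pictures that flow under heavy simplification. The stage functions, the dictionary-based model, the `|` line separator, and `run_from` are all hypothetical choices made for this example.

```python
def detect_lines(model, correction=None):
    """Toy line stage: splits the page on '|' unless the user supplies the lines."""
    lines = correction if correction is not None else model["page"].split("|")
    return {**model, "lines": lines}


def detect_words(model, correction=None):
    """Toy word stage: derives words from whatever lines the previous stage produced."""
    if correction is not None:
        words = correction
    else:
        words = [w for line in model["lines"] for w in line.split()]
    return {**model, "words": words}


STAGES = [("lines", detect_lines), ("words", detect_words)]


def run_from(stages, snapshots, start, corrections=None):
    """Re-run stage `start` and every later stage.

    snapshots[i] holds the model that was fed into stage i, so a correction
    can restart from the stage that made the initial error.
    """
    corrections = corrections or {}
    model = snapshots[start]
    for i in range(start, len(stages)):
        snapshots[i] = model
        name, stage = stages[i]
        model = stage(model, corrections.get(name))
    return model


# First pass: the line stage wrongly splits "world" across two lines.
snapshots = [{"page": "hello wor|ld"}, None]
first = run_from(STAGES, snapshots, 0)

# The user corrects the line stage; the word stage then fixes its own output.
fixed = run_from(STAGES, snapshots, 0, {"lines": ["hello world"]})

print(first["words"])  # ['hello', 'wor', 'ld']
print(fixed["words"])  # ['hello', 'world']
```

    Keeping a snapshot of each stage's input is a design choice of this sketch; it lets a correction restart at the responsible stage instead of reprocessing the whole document.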

    DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION
    3.
    Invention patent application
    DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION (status: in force)

    Publication number: US20110222769A1

    Publication date: 2011-09-15

    Application number: US12720943

    Filing date: 2010-03-10

    IPC classification: G06K9/34 G06K9/72

    Abstract: Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters “sit”) and mean line (the line under which most of the characters “hang”). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.
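
    Illustrative sketch (not part of the patent record): the last two sentences of the abstract describe detecting image objects indirectly, by using the detected text to define the background and treating what remains as an image object. The Python below shows one simple way to picture that idea. Estimating the background as the median intensity of non-ink pixels inside text line boxes, and the names `image_mask`, `text_boxes`, `text_ink`, and `tol`, are assumptions of this example, not details taken from the patent.

```python
from statistics import median


def image_mask(gray, text_boxes, text_ink, tol=20):
    """Mark pixels that are neither text ink nor background.

    gray:       2D list of gray-scale intensities
    text_boxes: (row0, row1, col0, col1) half-open boxes around detected text lines
    text_ink:   set of (row, col) pixels that belong to text components
    tol:        assumed tolerance around the estimated background intensity
    """
    # Pixels inside text boxes that are not ink are taken as background samples.
    samples = [
        gray[r][c]
        for (r0, r1, c0, c1) in text_boxes
        for r in range(r0, r1)
        for c in range(c0, c1)
        if (r, c) not in text_ink
    ]
    background = median(samples)

    # Whatever is neither text nor close to the background level is "image".
    return [
        [(r, c) not in text_ink and abs(value - background) > tol
         for c, value in enumerate(row)]
        for r, row in enumerate(gray)
    ]
```

    A single global background level is the simplest possible model; a page with a textured or gradient background would need a spatially varying estimate, which this sketch does not attempt.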