USER CORRECTION OF ERRORS ARISING IN A TEXTUAL DOCUMENT UNDERGOING OPTICAL CHARACTER RECOGNITION (OCR) PROCESS
    1.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20110280481A1

    Publication date: 2011-11-17

    Application No.: US12780991

    Filing date: 2010-05-17

    IPC classification: G06K9/03 G06K9/34

    CPC classification: G06K9/033

    Abstract: An electronic model of the image document is created by undergoing an OCR process. The electronic model includes elements (e.g., words, text lines, paragraphs, images) of the image document that have been determined by each of a plurality of sequentially executed stages in the OCR process. The electronic model serves as input information which is supplied to each of the stages by a previous stage that processed the image document. A graphical user interface is presented to the user so that the user can provide user input data correcting a mischaracterized item appearing in the document. Based on the user input data, the processing stage which produced the initial error that gave rise to the mischaracterized item corrects the initial error. Stages of the OCR process subsequent to this stage then correct any consequential errors arising in their respective stages as a result of the initial error.
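The staged correction described in this abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the stage functions (`find_words`, `build_paragraphs`), the dictionary-based document model, and the `word_overrides` key are hypothetical and are not taken from the patent. The sketch only shows the general idea of re-running the stage that made the error, and every stage after it, once the user supplies a correction.

```python
from typing import Callable, List

# Hypothetical stage signature: takes the document model, returns an updated model.
Stage = Callable[[dict], dict]


def run_from(stages: List[Stage], model: dict, start: int = 0) -> dict:
    """Run stages[start:] in order; each stage consumes the previous stage's model."""
    for stage in stages[start:]:
        model = stage(model)
    return model


def find_words(model: dict) -> dict:
    # Toy word stage: split each line on spaces unless the user supplied a fix.
    overrides = model.get("word_overrides", {})
    words = [overrides.get(i, line.split()) for i, line in enumerate(model["lines"])]
    return {**model, "words": words}


def build_paragraphs(model: dict) -> dict:
    # Toy paragraph stage: all words from all lines form a single paragraph.
    return {**model, "paragraphs": [[w for ws in model["words"] for w in ws]]}


STAGES = [find_words, build_paragraphs]


def apply_user_correction(model: dict, stage_index: int, correction: dict) -> dict:
    """Record the user's fix, then re-run the faulty stage and all later stages."""
    corrected = {**model, **correction}
    return run_from(STAGES, corrected, start=stage_index)


# Initial pass: the word stage wrongly splits "hello" into two words.
model = run_from(STAGES, {"lines": ["hel lo world"]})

# The user corrects the words of line 0; stage 0 and the paragraph stage re-run.
fixed = apply_user_correction(model, 0, {"word_overrides": {0: ["hello", "world"]}})
print(fixed["paragraphs"])
```

Because the paragraph stage is re-run on the corrected words, the consequential error in `paragraphs` disappears without the user touching it directly.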

    Detecting position of word breaks in a textual line image
    2.
    Type: Granted patent
    Status: In force

    Publication No.: US08345978B2

    Publication date: 2013-01-01

    Application No.: US12749599

    Filing date: 2010-03-30

    IPC classification: G06K9/00

    Abstract: Line segmentation in an OCR process is performed to detect the positions of words within an input textual line image by extracting features from the input to locate breaks and then classifying the breaks into one of two break classes which include inter-word breaks and inter-character breaks. An output including the bounding boxes of the detected words and a probability that a given break belongs to the identified class can then be provided to downstream OCR or other components for post-processing. Advantageously, by reducing line segmentation to the extraction of features, including the position of each break and the number of break features, and break classification, the task of line segmentation is made less complex but with no loss of generality.
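As a rough illustration of the two steps named in this abstract (locating breaks, then classifying each as inter-word or inter-character with a probability), here is a minimal Python sketch. It is not the patented method: the input is assumed to be a vertical projection (ink-pixel count per column), the only feature used is the break width, and the logistic scoring with its `threshold` and `scale` parameters is a stand-in chosen for this example. Word boxes are reduced to horizontal `(x_start, x_end)` spans, and the line is assumed to contain at least one ink column.

```python
import math


def find_breaks(columns):
    """Return (start, width) for each run of empty columns that lies between ink columns."""
    breaks, start = [], None
    for x, ink in enumerate(columns):
        if ink == 0 and start is None:
            start = x
        elif ink > 0 and start is not None:
            if start > 0:  # a run starting at column 0 is the left margin, not a break
                breaks.append((start, x - start))
            start = None
    return breaks  # a trailing empty run is never closed, so the right margin is ignored


def classify_break(width, threshold=3.0, scale=1.0):
    """Toy classifier: logistic score of the break width (assumed parameters)."""
    p_word = 1.0 / (1.0 + math.exp(-(width - threshold) / scale))
    if p_word >= 0.5:
        return "inter-word", p_word
    return "inter-character", 1.0 - p_word


def segment_words(columns):
    """Return half-open word spans and the labelled breaks (start, width, label, prob)."""
    labelled = [(s, w, *classify_break(w)) for s, w in find_breaks(columns)]
    ink = [x for x, count in enumerate(columns) if count > 0]
    words, word_start = [], ink[0]
    for s, w, label, _ in labelled:
        if label == "inter-word":
            words.append((word_start, s))
            word_start = s + w
    words.append((word_start, ink[-1] + 1))
    return words, labelled


# Columns 1-2 and 4 hold one word (1-column gap), columns 10-11 a second word.
projection = [0, 2, 3, 0, 4, 0, 0, 0, 0, 0, 5, 1, 0]
words, labelled = segment_words(projection)
print(words)  # [(1, 5), (10, 12)]
```

The probability returned with each label is what a downstream component could use to revisit uncertain breaks.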

    Page classifier engine
    3.
    Type: Granted patent
    Status: In force

    Publication No.: US08392816B2

    Publication date: 2013-03-05

    Application No.: US11949586

    Filing date: 2007-12-03

    IPC classification: G06F17/21

    CPC classification: G06K9/00469

    Abstract: Embodiments of the present invention relate to classifying pages of an electronic document, such as a scanned book page. OCR software is applied to the contents of the electronic document, revealing semantic information about the content of the electronic document. Software-based features are applied to the semantic information to determine the type of page the electronic document is. Page types may include table of contents (TOC), table of figures (TOF), bibliography, index, or other types of pages commonly found in a book, magazine, or other publication. Once determined, the determined page type is stored and used by other software engines.
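To make the idea of "software-based features applied to OCR output" concrete, the following Python sketch computes a few line-level features and maps them to a page type. The specific features, the 0.5 thresholds, and the rule order are assumptions invented for this example; the abstract does not say which features or decision procedure the patent uses.

```python
import re


def page_features(lines):
    """Fraction of OCR text lines that match each (hypothetical) pattern."""
    n = max(len(lines), 1)
    return {
        "ends_with_number": sum(bool(re.search(r"\d+\s*$", l)) for l in lines) / n,
        "has_dot_leader": sum(bool(re.search(r"\.{3,}", l)) for l in lines) / n,
        "starts_with_ref": sum(bool(re.match(r"\s*\[\d+\]", l)) for l in lines) / n,
    }


def classify_page(lines):
    """Toy rule-based page classifier over the features above."""
    f = page_features(lines)
    if f["starts_with_ref"] > 0.5:
        return "bibliography"
    if f["ends_with_number"] > 0.5 and f["has_dot_leader"] > 0.5:
        return "TOC"
    return "other"


print(classify_page(["1 Introduction ...... 1", "2 Methods ........ 7"]))
print(classify_page(["[1] A. Author, Title, 2001", "[2] B. Author, Title, 1999"]))
```

A real system would more likely feed such features to a trained classifier; fixed rules are used here only to keep the sketch short.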

    DETECTING POSITION OF WORD BREAKS IN A TEXTUAL LINE IMAGE
    4.
    Type: Patent application
    Status: In force

    Publication No.: US20110243445A1

    Publication date: 2011-10-06

    Application No.: US12749599

    Filing date: 2010-03-30

    IPC classification: G06K9/34 G06K9/18

    Abstract: Line segmentation in an OCR process is performed to detect the positions of words within an input textual line image by extracting features from the input to locate breaks and then classifying the breaks into one of two break classes which include inter-word breaks and inter-character breaks. An output including the bounding boxes of the detected words and a probability that a given break belongs to the identified class can then be provided to downstream OCR or other components for post-processing. Advantageously, by reducing line segmentation to the extraction of features, including the position of each break and the number of break features, and break classification, the task of line segmentation is made less complex but with no loss of generality.

    Document page segmentation in optical character recognition
    5.
    Type: Granted patent
    Status: In force

    Publication No.: US08509534B2

    Publication date: 2013-08-13

    Application No.: US12720943

    Filing date: 2010-03-10

    IPC classification: G06K9/36

    Abstract: Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters “sit”) and mean line (the line under which most of the characters “hang”). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.
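The native-line candidate step can be sketched in Python as a grouping of connected-component bounding boxes. This is a simplified illustration under several assumptions that do not come from the patent: components are given as `(x0, top, x1, bottom)` boxes in image coordinates (y grows downward), a box's top edge stands in for the mean line and its bottom edge for the baseline, ascenders and descenders are ignored, and the tolerances `tol` and `max_gap` are arbitrary.

```python
def group_native_lines(boxes, tol=2, max_gap=10):
    """Group boxes into candidate lines of horizontally neighbouring components.

    A box joins a line when its top and bottom edges are within `tol` pixels of
    the line's last box and the horizontal gap to that box is at most `max_gap`.
    """
    lines = []
    for box in sorted(boxes):  # tuples sort by x0 first, i.e. left to right
        x0, top, x1, bottom = box
        for line in lines:
            _, last_top, last_x1, last_bottom = line[-1]
            similar = abs(top - last_top) <= tol and abs(bottom - last_bottom) <= tol
            if similar and x0 - last_x1 <= max_gap:
                line.append(box)
                break
        else:
            lines.append([box])  # no existing line accepted the box
    return lines


components = [
    (0, 10, 5, 20), (7, 10, 12, 20), (14, 11, 19, 20),  # first text row
    (0, 40, 6, 50), (8, 40, 13, 50),                    # second text row
]
for line in group_native_lines(components):
    print(line)
```

The subsequent textual versus non-textual classification of each candidate is not covered by this sketch.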

    DOCUMENT PAGE SEGMENTATION IN OPTICAL CHARACTER RECOGNITION
    6.
    Type: Patent application
    Status: In force

    Publication No.: US20110222769A1

    Publication date: 2011-09-15

    Application No.: US12720943

    Filing date: 2010-03-10

    IPC classification: G06K9/34 G06K9/72

    Abstract: Page segmentation in an optical character recognition process is performed to detect textual objects and/or image objects. Textual objects in an input gray scale image are detected by selecting candidates for native lines which are sets of horizontally neighboring connected components (i.e., subsets of image pixels where each pixel from the set is connected with all remaining pixels from the set) having similar vertical statistics defined by values of baseline (the line upon which most text characters “sit”) and mean line (the line under which most of the characters “hang”). Binary classification is performed on the native line candidates to classify them as textual or non-textual through examination of any embedded regularity. Image objects are indirectly detected by detecting the image's background using the detected text to define the background. Once the background is detected, what remains (i.e., the non-background) is an image object.

    Geometric parsing of mathematical expressions
    7.
    Type: Patent application
    Status: In force

    Publication No.: US20080253657A1

    Publication date: 2008-10-16

    Application No.: US11784889

    Filing date: 2007-04-10

    IPC classification: G06K9/18

    CPC classification: G06K9/00402

    Abstract: A processing device may parse a group of strokes representing a mathematical expression. The group of strokes may be examined to determine whether the group of strokes satisfies any of a finite set of rules. When the group of strokes, included in a region, satisfies any of the finite set of rules, the region may be partitioned according to a satisfied one of the finite set of rules. The group of strokes included in the region may be further examined to determine whether the group of strokes may be further partitioned according to any of the finite set of rules. After all regions have been examined and no further partitioning of regions may be performed, all mathematical symbols of the mathematical expression may be isolated in at least some of the regions and may be recognized.
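The rule-driven recursive partitioning in this abstract can be sketched in Python as follows. The abstract does not list the actual rules, so this example substitutes two assumed ones in the style of a recursive X-Y cut: split a region wherever there is an empty vertical gap, otherwise wherever there is an empty horizontal gap. Strokes are represented by hypothetical `(x0, y0, x1, y1)` bounding boxes, and boxes that merely touch are treated as overlapping.

```python
def split_on_gap(boxes, lo, hi):
    """Split boxes into groups separated by empty gaps along one axis.

    `lo` and `hi` are the tuple indices of a box's min and max coordinate on that axis.
    """
    groups, reach = [], None
    for box in sorted(boxes, key=lambda b: b[lo]):
        if reach is None or box[lo] > reach:
            groups.append([box])       # gap found: start a new group
            reach = box[hi]
        else:
            groups[-1].append(box)     # overlaps the current group's extent
            reach = max(reach, box[hi])
    return groups


# Assumed finite rule set; each rule returns the sub-regions it would create.
RULES = [
    lambda boxes: split_on_gap(boxes, 0, 2),  # vertical cut: gap along x
    lambda boxes: split_on_gap(boxes, 1, 3),  # horizontal cut: gap along y
]


def partition(boxes):
    """Apply the first rule that splits the region, then recurse into each part."""
    for rule in RULES:
        parts = rule(boxes)
        if len(parts) > 1:
            return [leaf for part in parts for leaf in partition(part)]
    return [boxes]  # no rule applies: the region is an isolated symbol


# Strokes for a stacked fraction (numerator, bar, denominator), then "+" and a symbol.
strokes = [(0, 0, 4, 4), (0, 5, 4, 6), (0, 7, 4, 11), (6, 4, 9, 7), (11, 3, 15, 8)]
regions = partition(strokes)
print(len(regions))  # 5
```

Recursion stops when no rule can split a region further, which mirrors the abstract's termination condition; each leaf region would then be handed to a symbol recognizer.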

    Paragraph recognition in an optical character recognition (OCR) process
    8.
    Type: Granted patent
    Status: In force

    Publication No.: US08565474B2

    Publication date: 2013-10-22

    Application No.: US12720992

    Filing date: 2010-03-10

    IPC classification: G06K9/00

    CPC classification: G06K9/00469 G06K9/00463

    Abstract: An image processing apparatus for detecting paragraphs in a textual image includes an input component for receiving an input image in which textual lines and words have been identified and a page classification component for classifying the input image as a first or second page type. The apparatus also includes a paragraph detection component for classifying all textual lines on the input image as a beginning paragraph line or a continuation paragraph line. The apparatus is also provided with a paragraph creation component for creating paragraphs that include textual lines between two successive beginning paragraph lines, including a first of the two successive beginning paragraph lines. The paragraphs that have been identified may be classified by the type of alignment they exhibit. For instance, paragraphs may be classified according to whether they are left aligned, right aligned, center aligned or justified.
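The paragraph creation and alignment steps lend themselves to a short Python sketch. It assumes that each line has already been labelled as a beginning or continuation line (the detection step itself is not shown), and that each line is described by a hypothetical `(x_start, x_end)` span. The alignment test is deliberately crude: it ignores first-line indents and short last lines, and the tolerance `tol` is arbitrary.

```python
def build_paragraphs(lines, is_beginning):
    """Group lines so that each paragraph starts at a beginning line and runs
    up to, but not including, the next beginning line."""
    paragraphs = []
    for line, begins in zip(lines, is_beginning):
        if begins or not paragraphs:
            paragraphs.append([line])
        else:
            paragraphs[-1].append(line)
    return paragraphs


def classify_alignment(spans, page_left, page_right, tol=2):
    """Classify a paragraph from the (x_start, x_end) span of each of its lines."""
    left_flush = all(abs(s - page_left) <= tol for s, _ in spans)
    right_flush = all(abs(e - page_right) <= tol for _, e in spans)
    centered = all(abs((s - page_left) - (page_right - e)) <= tol for s, e in spans)
    if left_flush and right_flush:
        return "justified"
    if left_flush:
        return "left"
    if right_flush:
        return "right"
    if centered:
        return "center"
    return "unknown"


lines = ["A", "b", "C", "d", "e"]
labels = [True, False, True, False, False]  # True marks a beginning paragraph line
print(build_paragraphs(lines, labels))      # [['A', 'b'], ['C', 'd', 'e']]
print(classify_alignment([(0, 80), (0, 60)], page_left=0, page_right=100))  # left
```

Treating the first line as a paragraph start even when it is labelled as a continuation is a design choice of this sketch, made so that no line is dropped.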

    Geometric parsing of mathematical expressions
    9.
    Type: Granted patent
    Status: In force

    Publication No.: US08064696B2

    Publication date: 2011-11-22

    Application No.: US11784889

    Filing date: 2007-04-10

    IPC classification: G06K9/00

    CPC classification: G06K9/00402

    Abstract: A processing device may parse a group of strokes representing a mathematical expression. The group of strokes may be examined to determine whether the group of strokes satisfies any of a finite set of rules. When the group of strokes, included in a region, satisfies any of the finite set of rules, the region may be partitioned according to a satisfied one of the finite set of rules. The group of strokes included in the region may be further examined to determine whether the group of strokes may be further partitioned according to any of the finite set of rules. After all regions have been examined and no further partitioning of regions may be performed, all mathematical symbols of the mathematical expression may be isolated in at least some of the regions and may be recognized.

    PARAGRAPH RECOGNITION IN AN OPTICAL CHARACTER RECOGNITION (OCR) PROCESS
    10.
    Type: Patent application
    Status: In force

    Publication No.: US20110222773A1

    Publication date: 2011-09-15

    Application No.: US12720992

    Filing date: 2010-03-10

    IPC classification: G06K9/18 G06K9/62

    CPC classification: G06K9/00469 G06K9/00463

    Abstract: An image processing apparatus for detecting paragraphs in a textual image includes an input component for receiving an input image in which textual lines and words have been identified and a page classification component for classifying the input image as a first or second page type. The apparatus also includes a paragraph detection component for classifying all textual lines on the input image as a beginning paragraph line or a continuation paragraph line. The apparatus is also provided with a paragraph creation component for creating paragraphs that include textual lines between two successive beginning paragraph lines, including a first of the two successive beginning paragraph lines. The paragraphs that have been identified may be classified by the type of alignment they exhibit. For instance, paragraphs may be classified according to whether they are left aligned, right aligned, center aligned or justified.
