Document layout extraction
    1.
    Granted patent
    Status: In force

    Publication number: US08250469B2

    Publication date: 2012-08-21

    Application number: US11949537

    Filing date: 2007-12-03

    IPC classification: G06F17/00

    Abstract: Computer-readable media, systems, and methods for document layout extraction are described. In embodiments, textual data in an electronic format is received and the textual data is converted from the electronic format to an independent interface format, the independent interface format including coordinates to one or more structural elements of the textual data. Further, in embodiments, a structure and layout analysis of the textual data is performed to generate a set of structure and layout information. Still further, in embodiments, the textual data and the set of structure and layout information is stored in an enriched interface format, the enriched interface format providing for search and navigation of the textual data.
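
The flow in this abstract (coordinates for structural elements, a layout analysis pass, then an enriched format that supports search and navigation) can be illustrated with a minimal sketch. Everything named here is hypothetical: `Word`, `EnrichedDocument`, `analyze_layout`, `search`, and the `y_tolerance` parameter are illustrative choices, and grouping words into lines by vertical position is only one simple stand-in for the structure and layout analysis the patent describes.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """A structural element with page coordinates (assumed 'independent' format)."""
    text: str
    x: float
    y: float
    w: float
    h: float


@dataclass
class EnrichedDocument:
    """Words plus layout information (assumed 'enriched' format)."""
    words: list
    lines: list = field(default_factory=list)  # each line is a list of word indices


def analyze_layout(words, y_tolerance=2.0):
    """Group words into text lines by comparing their y coordinates."""
    order = sorted(range(len(words)), key=lambda i: (words[i].y, words[i].x))
    lines = []
    for i in order:
        # Same line if close in y to the first word of the current line.
        if lines and abs(words[lines[-1][0]].y - words[i].y) <= y_tolerance:
            lines[-1].append(i)
        else:
            lines.append([i])
    for line in lines:
        line.sort(key=lambda i: words[i].x)  # reading order within a line
    return EnrichedDocument(words=words, lines=lines)


def search(doc, term):
    """Return (line_number, word) pairs for case-insensitive exact matches."""
    hits = []
    for n, line in enumerate(doc.lines):
        for i in line:
            if doc.words[i].text.lower() == term.lower():
                hits.append((n, doc.words[i]))
    return hits


words = [
    Word("Hello", 10, 10, 30, 8),
    Word("world", 45, 10.5, 30, 8),
    Word("Second", 10, 25, 40, 8),
    Word("hello", 55, 25, 30, 8),
]
doc = analyze_layout(words)
hits = search(doc, "hello")
```

Because each hit carries both a line number and the word's coordinates, a viewer could jump to the match on the page, which is the kind of navigation the enriched format is meant to enable.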

    Page classifier engine
    2.
    Granted patent
    Status: In force

    Publication number: US08392816B2

    Publication date: 2013-03-05

    Application number: US11949586

    Filing date: 2007-12-03

    IPC classification: G06F17/21

    CPC classification: G06K9/00469

    Abstract: Embodiments of the present invention relate to classifying pages of an electronic document, such as a scanned book page. OCR software is applied to the contents of the electronic document, revealing semantic information about the content of the electronic document. Software-based features are applied to the semantic information to determine the type of page the electronic document is. Page types may include table of contents (TOC), table of figures (TOF), bibliography, index, or other types of pages commonly found in a book, magazine, or other publication. Once determined, the determined page type is stored and used by other software engines.
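
As a rough illustration of applying features to OCR output to decide a page type, the sketch below uses a few hand-written rules. The feature names, the regular expressions, the 0.5 threshold, and the functions `page_features` and `classify_page` are all assumptions made for this example; the abstract does not say which features or decision logic the patented engine uses.

```python
import re


def page_features(lines):
    """Compute simple, hypothetical features from the OCR text lines of a page."""
    n = max(len(lines), 1)
    return {
        # Fraction of lines ending in a number (typical of TOC and index pages).
        "ends_with_number": sum(bool(re.search(r"\d+\s*$", l)) for l in lines) / n,
        "has_toc_heading": any(
            re.match(r"\s*(table of\s+)?contents\s*$", l, re.I) for l in lines),
        "has_figures_heading": any(
            re.match(r"\s*(list|table) of figures\s*$", l, re.I) for l in lines),
        "has_bib_heading": any(
            re.match(r"\s*(bibliography|references)\s*$", l, re.I) for l in lines),
        "has_index_heading": any(
            re.match(r"\s*index\s*$", l, re.I) for l in lines),
    }


def classify_page(lines):
    """Map features to a page type with a small rule cascade."""
    f = page_features(lines)
    if f["has_figures_heading"]:
        return "TOF"
    if f["has_toc_heading"] and f["ends_with_number"] >= 0.5:
        return "TOC"
    if f["has_bib_heading"]:
        return "bibliography"
    if f["has_index_heading"] and f["ends_with_number"] >= 0.5:
        return "index"
    return "other"


toc_page = ["Contents", "1 Introduction ...... 1", "2 Methods ...... 17", "3 Results ...... 42"]
body_page = ["It was a dark night.", "Nothing happened in 1999."]
page_type = classify_page(toc_page)
```

The returned label could then be stored alongside the page so that later processing steps can read it, in line with the final sentence of the abstract.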

    USER CORRECTION OF ERRORS ARISING IN A TEXTUAL DOCUMENT UNDERGOING OPTICAL CHARACTER RECOGNITION (OCR) PROCESS
    3.
    Patent application
    Status: Under examination (published)

    Publication number: US20110280481A1

    Publication date: 2011-11-17

    Application number: US12780991

    Filing date: 2010-05-17

    IPC classification: G06K9/03 G06K9/34

    CPC classification: G06K9/033

    Abstract: An electronic model of the image document is created by undergoing an OCR process. The electronic model includes elements (e.g., words, text lines, paragraphs, images) of the image document that have been determined by each of a plurality of sequentially executed stages in the OCR process. The electronic model serves as input information which is supplied to each of the stages by a previous stage that processed the image document. A graphical user interface is presented to the user so that the user can provide user input data correcting a mischaracterized item appearing in the document. Based on the user input data, the processing stage which produced the initial error that gave rise to the mischaracterized item corrects the initial error. Stages of the OCR process subsequent to this stage then correct any consequential errors arising in their respective stages as a result of the initial error.
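
The staged correction idea can be sketched as a pipeline in which a user correction is applied at the stage that produced the error and every later stage is re-run on the corrected model. This is a toy under stated assumptions: the three stages, the `image_text` string standing in for the scanned image, and the functions `run` and `apply_user_correction` are hypothetical, and the sketch simply substitutes the user's value for the faulty stage output rather than modeling how a real stage would correct itself.

```python
def split_lines(model):
    # Stand-in for line detection: '|' marks a line boundary in the fake input.
    return model["image_text"].split("|")


def split_words(model):
    # Stand-in for word segmentation of each detected line.
    return [line.split() for line in model["lines"]]


def count_words(model):
    # A downstream stage whose result depends on the word segmentation.
    return sum(len(ws) for ws in model["words"])


# Sequential stages; each reads the shared model and writes one element of it.
STAGES = [("lines", split_lines), ("words", split_words), ("word_count", count_words)]


def run(model, start=0, corrections=None):
    """Run stages from index `start`, using a correction in place of a stage output."""
    corrections = corrections or {}
    for name, stage in STAGES[start:]:
        model[name] = corrections[name] if name in corrections else stage(model)
    return model


def apply_user_correction(model, stage_name, corrected_value):
    """Fix the output of the faulty stage, then re-run only the stages after it."""
    start = [name for name, _ in STAGES].index(stage_name)
    return run(model, start=start, corrections={stage_name: corrected_value})


model = run({"image_text": "hello wor ld|second line"})
count_before = model["word_count"]
# The user reports that "wor ld" is a single word.
model = apply_user_correction(
    model, "words", [["hello", "world"], ["second", "line"]])
count_after = model["word_count"]
```

Stages before the corrected one are left untouched, while the dependent word count is recomputed, which mirrors the propagation of consequential fixes described in the abstract.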

    DETECTING POSITION OF WORD BREAKS IN A TEXTUAL LINE IMAGE
    4.
    Patent application
    Status: In force

    Publication number: US20110243445A1

    Publication date: 2011-10-06

    Application number: US12749599

    Filing date: 2010-03-30

    IPC classification: G06K9/34 G06K9/18

    Abstract: Line segmentation in an OCR process is performed to detect the positions of words within an input textual line image by extracting features from the input to locate breaks and then classifying the breaks into one of two break classes which include inter-word breaks and inter-character breaks. An output including the bounding boxes of the detected words and a probability that a given break belongs to the identified class can then be provided to downstream OCR or other components for post-processing. Advantageously, by reducing line segmentation to the extraction of features, including the position of each break and the number of break features, and break classification, the task of line segmentation is made less complex but with no loss of generality.
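
The two steps named in the abstract, locating breaks and classifying each one as inter-word or inter-character with a probability, can be sketched on a one-dimensional ink profile. The input representation (`column_ink`, the amount of ink in each pixel column of the line image), the width-based logistic score, and the `threshold` and `sharpness` values are assumptions for illustration; a real system would likely use a trained classifier over richer features.

```python
import math


def find_breaks(column_ink):
    """Return (start, width) for each run of blank columns between inked columns."""
    breaks, start = [], None
    for x, ink in enumerate(column_ink):
        if ink == 0 and start is None:
            start = x
        elif ink != 0 and start is not None:
            breaks.append((start, x - start))
            start = None
    # A run starting at column 0 is the left margin, not a break. A trailing
    # blank run is never appended because no inked column follows it.
    return [b for b in breaks if b[0] > 0]


def classify_breaks(breaks, threshold=3.0, sharpness=1.5):
    """Label each break and attach the probability of the chosen class."""
    out = []
    for start, width in breaks:
        p_word = 1.0 / (1.0 + math.exp(-sharpness * (width - threshold)))
        if p_word >= 0.5:
            out.append((start, width, "inter-word", p_word))
        else:
            out.append((start, width, "inter-character", 1.0 - p_word))
    return out


def word_spans(column_ink, classified):
    """Horizontal extents (first, last column) of words; assumes some ink exists."""
    first = next(x for x, ink in enumerate(column_ink) if ink)
    last = max(x for x, ink in enumerate(column_ink) if ink)
    spans, left = [], first
    for start, width, label, _ in classified:
        if label == "inter-word":
            spans.append((left, start - 1))
            left = start + width
    spans.append((left, last))
    return spans


column_ink = [0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0]
classified = classify_breaks(find_breaks(column_ink))
spans = word_spans(column_ink, classified)
```

Combined with the top and bottom of the line image, each span gives a word bounding box, and the per-break probabilities can be passed on to later components as the abstract suggests.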

    Detecting position of word breaks in a textual line image
    5.
    Granted patent
    Status: In force

    Publication number: US08345978B2

    Publication date: 2013-01-01

    Application number: US12749599

    Filing date: 2010-03-30

    IPC classification: G06K9/00

    Abstract: Line segmentation in an OCR process is performed to detect the positions of words within an input textual line image by extracting features from the input to locate breaks and then classifying the breaks into one of two break classes which include inter-word breaks and inter-character breaks. An output including the bounding boxes of the detected words and a probability that a given break belongs to the identified class can then be provided to downstream OCR or other components for post-processing. Advantageously, by reducing line segmentation to the extraction of features, including the position of each break and the number of break features, and break classification, the task of line segmentation is made less complex but with no loss of generality.

    Multi-finger detection and component resolution
    7.
    Granted patent
    Status: In force

    Publication number: US08913019B2

    Publication date: 2014-12-16

    Application number: US13183377

    Filing date: 2011-07-14

    IPC classification: G06F3/041

    CPC classification: G06F3/0416 G06F2203/04104

    Abstract: In embodiments of multi-finger detection and component resolution, touch input sensor data is recognized as a component of a multi-finger gesture on a touch-screen display. An ellipse is determined that approximately encompasses the component, and the ellipse has a primary axis and a secondary axis that are orthogonal. A distribution is then generated that projects sensor data elements from the primary axis based on detected intensity of the touch input sensor data. A histogram function can then be generated based on the distribution, where the histogram function indicates individual contacts of the component and separation of the individual contacts.
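
One way to picture the ellipse, projection, and histogram steps is the sketch below. It assumes the sensor data is a list of `(x, y, intensity)` samples, approximates the ellipse's primary axis with intensity-weighted second moments, and counts contacts as histogram runs above a cutoff. The function names, `bin_width`, and `valley_ratio` are hypothetical, and the patent may fit the ellipse and read the histogram differently.

```python
import math


def primary_axis(samples):
    """Return the weighted centroid and the unit direction of the primary axis."""
    total = sum(w for _, _, w in samples)
    mx = sum(x * w for x, _, w in samples) / total
    my = sum(y * w for _, y, w in samples) / total
    cxx = sum(w * (x - mx) ** 2 for x, _, w in samples) / total
    cyy = sum(w * (y - my) ** 2 for _, y, w in samples) / total
    cxy = sum(w * (x - mx) * (y - my) for x, y, w in samples) / total
    # Orientation of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def projection_histogram(samples, bin_width=1.0):
    """Accumulate sample intensity in bins along the primary axis."""
    (mx, my), (dx, dy) = primary_axis(samples)
    proj = [((x - mx) * dx + (y - my) * dy, w) for x, y, w in samples]
    lo = min(p for p, _ in proj)
    nbins = int((max(p for p, _ in proj) - lo) / bin_width) + 1
    hist = [0.0] * nbins
    for p, w in proj:
        hist[int((p - lo) / bin_width)] += w
    return hist


def count_contacts(hist, valley_ratio=0.5):
    """Count runs of bins above a cutoff; valleys between runs separate contacts."""
    cutoff = valley_ratio * max(hist)
    contacts, inside = 0, False
    for v in hist:
        if v > cutoff and not inside:
            contacts += 1
        inside = v > cutoff
    return contacts


# Two touching fingers side by side: two intensity peaks with a dip between them.
row = [5, 9, 5, 1, 5, 9, 5]
two_fingers = [(x, y, w) for y in (0, 1) for x, w in enumerate(row)]
hist = projection_histogram(two_fingers)
contacts = count_contacts(hist)
```

The dip in the histogram between the two peaks is what separates the merged component into individual contacts; the secondary axis is implied as the direction orthogonal to the one returned by `primary_axis`.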
