Identifying a selection of content in a structured document
    2.
    Type: Granted patent
    Legal status: Active

    Publication number: US08549399B2

    Publication date: 2013-10-01

    Application number: US13109918

    Filing date: 2011-05-17

    IPC classification: G06F17/27

    Abstract: For a document with content that has been structured into a set of primitive areas, a novel method for performing contiguous selection of document content across different primitive areas in the document is disclosed. The method defines a contiguous section in the ordered list by identifying the first and last primitive elements of the contiguous selection. The first primitive element is identified as the primitive element that is closest in reading flow to a start selection point on the page, while the last primitive element is identified as the primitive element that is closest in reading flow to an end selection point on the page.
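The selection step in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the `Primitive` class, the use of element centres, and plain Euclidean distance as a stand-in for "closest in reading flow" are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Primitive:
    text: str
    x: float  # centre of the element on the page (assumed representation)
    y: float


def nearest_index(elements, point):
    """Index of the element whose centre is nearest to `point`.

    Euclidean distance is an assumption standing in for the abstract's
    "closest in reading flow".
    """
    px, py = point
    return min(range(len(elements)),
               key=lambda i: hypot(elements[i].x - px, elements[i].y - py))


def contiguous_selection(elements, start_point, end_point):
    """Slice of the reading-ordered list between the two nearest elements."""
    first = nearest_index(elements, start_point)
    last = nearest_index(elements, end_point)
    if first > last:  # the user dragged backwards; normalise the range
        first, last = last, first
    return elements[first:last + 1]


# Elements are assumed to be already sorted in reading order.
page = [Primitive("A", 0, 0), Primitive("B", 10, 0),
        Primitive("C", 0, 10), Primitive("D", 10, 10)]
selected = contiguous_selection(page, start_point=(9, 1), end_point=(1, 9))
print([p.text for p in selected])
```

Because the selection is a slice of the ordered list, it stays contiguous in reading order even when the two points fall in different primitive areas.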

    Reconstruction of Lists in a Document
    3.
    Type: Patent application
    Legal status: Active

    Publication number: US20120185491A1

    Publication date: 2012-07-19

    Application number: US13106806

    Filing date: 2011-05-12

    IPC classification: G06F17/30

    Abstract: Some embodiments provide a method for analyzing a document that includes several primitive elements. The method identifies that a set of primitive elements includes an implicit list in the document based on the location and appearance of the set of primitive elements. The method defines the identified implicit list as an explicit list. The method stores the explicit list as a structure associated with the document.
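One way to picture "implicit list to explicit list" is sketched below. It is only an illustration under stated assumptions: the `Line` class, the `MARKER` pattern, and the rule "consecutive lines with the same left edge and a leading bullet or number form a list" are hypothetical simplifications of "location and appearance", not the method claimed in the application.

```python
import re
from dataclasses import dataclass

# Appearance cue (assumed): a leading number such as "1." or "2)", or a bullet.
MARKER = re.compile(r"^(\d+[.)]|[-*\u2022])\s+")


@dataclass
class Line:
    x: float   # left edge of the line on the page (location cue)
    text: str


def find_lists(lines, tol=1.0):
    """Group consecutive marker-led lines that share a left edge.

    Returns each detected list as a plain list of item strings, i.e. an
    explicit structure that could be stored alongside the document.
    """
    groups, current = [], []
    for line in lines:
        is_item = MARKER.match(line.text) is not None
        if is_item and current and abs(line.x - current[0].x) <= tol:
            current.append(line)
        else:
            if len(current) >= 2:  # a single marked line is not treated as a list
                groups.append(current)
            current = [line] if is_item else []
    if len(current) >= 2:
        groups.append(current)
    return [[MARKER.sub("", item.text) for item in group] for group in groups]


lines = [Line(0, "Intro"), Line(20, "1. one"), Line(20, "2. two"),
         Line(0, "Outro"), Line(20, "- solo")]
print(find_lists(lines))
```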

    Adaptive Graphic Objects
    4.
    Type: Patent application
    Legal status: Active

    Publication number: US20120182317A1

    Publication date: 2012-07-19

    Application number: US13106803

    Filing date: 2011-05-12

    IPC classification: G09G5/00

    CPC classification: G06T3/00 G06T3/0006

    Abstract: Some embodiments provide a method that defines a group of associated graphic objects for display on a display device. The method defines a set of operations to perform on the associated graphic objects in a particular order. The operations include one or more transforms applied to at least one of the graphic objects. For each particular transform applied to a set of the graphic objects, each graphic object in the set has a set of parameters indicating whether the graphic object is affected by each of a set of primitive transforms of the particular transform. The method stores the set of associated graphic objects and the set of operations as a single graphic object.
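The per-object parameters described in this abstract can be illustrated with a small sketch. Everything named here (`GraphicObject`, the `affected_by` flags, and the choice of translate and scale as the primitive transforms) is hypothetical and chosen for the example; the application does not specify this representation.

```python
from dataclasses import dataclass, field


@dataclass
class GraphicObject:
    name: str
    x: float
    y: float
    width: float
    height: float
    # Per-object flags (assumed): which primitive transforms affect this object.
    affected_by: dict = field(
        default_factory=lambda: {"translate": True, "scale": True})


def apply_transform(objects, dx=0.0, dy=0.0, scale=1.0):
    """Apply a transform made of two primitive transforms (scale, translate).

    Each object takes part only in the primitive transforms its flags allow.
    """
    for obj in objects:
        if obj.affected_by.get("scale", True):
            obj.width *= scale
            obj.height *= scale
        if obj.affected_by.get("translate", True):
            obj.x += dx
            obj.y += dy
    return objects


frame = GraphicObject("frame", 0, 0, 100, 50)
label = GraphicObject("label", 10, 10, 30, 10,
                      {"translate": True, "scale": False})
apply_transform([frame, label], dx=5, scale=2)
print(frame.width, label.width, label.x)
```

In this sketch the label moves with the group but keeps its size, which is the kind of adaptive behaviour the flags are meant to express.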

    Identification of Guides and Gutters of a Document
    5.
    Type: Patent application
    Legal status: Active

    Publication number: US20100174978A1

    Publication date: 2010-07-08

    Application number: US12479847

    Filing date: 2009-06-07

    IPC classification: G06F17/00

    Abstract: Some embodiments provide a method for analyzing an unstructured document that includes a number of words. Each word is an associated set of glyphs and each glyph has location coordinates. The method identifies clusters of words based on the location coordinates. Based on the identified clusters, the method defines a set of boundary elements for the glyphs that identify a set of borders for the glyphs. The method defines a structured document for the unstructured document based on the glyphs and the defined boundary elements. To identify clusters of words, the method orders the location coordinates and identifies several partitions of the location coordinates. Each partition specifies a particular grouping of the coordinates into subsets. For each partition, the method identifies a particular set of subsets of location values that satisfy a particular set of constraints and determines a set of subsets of location values that optimizes a particular measure.
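The "order the coordinates, then partition them into subsets" idea can be sketched in one dimension. This is a deliberately simplified stand-in: it uses a single gap-based partition with a minimum cluster size as the constraint, whereas the abstract describes evaluating several partitions and optimizing a measure. The function names and the `max_gap` and `min_size` parameters are assumptions.

```python
def cluster_coordinates(xs, max_gap=2.0, min_size=3):
    """Partition sorted coordinates wherever the gap exceeds `max_gap`.

    Subsets smaller than `min_size` are discarded (an assumed constraint).
    """
    clusters, current = [], []
    for x in sorted(xs):
        if current and x - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(x)
    if current:
        clusters.append(current)
    return [c for c in clusters if len(c) >= min_size]


def guide_positions(xs, **kwargs):
    """Mean coordinate of each surviving cluster, used as a candidate guide."""
    return [sum(c) / len(c) for c in cluster_coordinates(xs, **kwargs)]


# Left edges of words: two aligned groups plus one stray word at x = 150.
left_edges = [72.0, 72.5, 71.8, 300.0, 300.2, 299.9, 150.0]
print(guide_positions(left_edges))
```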

    Ordering document content based on reading flow
    7.
    Type: Granted patent
    Legal status: Active

    Publication number: US08543911B2

    Publication date: 2013-09-24

    Application number: US13109921

    Filing date: 2011-05-17

    IPC classification: G06F17/00

    Abstract: For a page that has been decomposed into a set of primitive areas, a novel method for organizing the set of primitive areas into an ordered list is disclosed. The primitive areas in the ordered list are initially sorted using start point order relation ordering, which compares the start points of the primitive areas in the coordinate system of the page. The ordering of the primitive areas in the ordered list is then refined by using contextual order relation ordering, which compares primitive areas against each other according to coordinate systems local to the primitive areas being compared. A new ordered list is then created by transposing primitive areas that are incorrectly ordered according to contextual order relation ordering.
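The two-stage ordering in this abstract (an initial sort by start point, then refinement by transposing misordered neighbours) can be sketched as below. The `Area` class and the particular contextual rule (areas that overlap horizontally read top to bottom, otherwise left to right) are assumptions for illustration; the patent's contextual order relation uses coordinate systems local to the compared areas and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    x0: float
    y0: float  # y grows downwards; (x0, y0) is the start point
    x1: float
    y1: float


def contextual_before(a, b):
    """Assumed pairwise rule: same column -> higher first, else left first."""
    overlaps = a.x0 < b.x1 and b.x0 < a.x1
    if overlaps:
        return (a.y0, a.x0) < (b.y0, b.x0)
    return a.x0 < b.x0


def reading_order(areas):
    # Stage 1: start point order in the page coordinate system.
    ordered = sorted(areas, key=lambda a: (a.y0, a.x0))
    # Stage 2: transpose adjacent pairs that the contextual rule disagrees
    # with. Passes are capped because the pairwise rule need not be transitive.
    for _ in range(len(ordered)):
        swapped = False
        for i in range(len(ordered) - 1):
            if contextual_before(ordered[i + 1], ordered[i]):
                ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
                swapped = True
        if not swapped:
            break
    return ordered


two_columns = [Area("C", 60, 0, 100, 10), Area("A", 0, 0, 40, 10),
               Area("D", 60, 20, 100, 30), Area("B", 0, 20, 40, 30)]
print([a.name for a in reading_order(two_columns)])
```

For a two-column page the start point sort alone interleaves the columns (A, C, B, D); the transposition pass then moves the left column ahead of the right one.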

    Content profiling to dynamically configure content processing
    8.
    Type: Granted patent
    Legal status: Active

    Publication number: US08473467B2

    Publication date: 2013-06-25

    Application number: US12479852

    Filing date: 2009-06-07

    IPC classification: G06F17/30

    Abstract: Some embodiments provide a method that receives an unstructured document including a number of primitive elements. The method identifies a default set of document reconstruction operations for reconstructing the unstructured document to define a structured document. The method performs at least one of the document reconstruction operations from the default set. Based on results of the performed document reconstruction operations, the method identifies a profile for the unstructured document. The method modifies the set of document reconstruction operations for reconstructing the unstructured document according to the identified profile.
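The profile-driven flow in this abstract (run part of a default pipeline, classify the document from the result, then adjust the remaining operations) might look like the following sketch. The operation names, the two profiles, and the column-counting heuristic are all hypothetical and exist only to make the control flow concrete.

```python
# Hypothetical default set of reconstruction operations.
DEFAULT_OPERATIONS = ["detect_columns", "detect_lists", "detect_tables"]


def detect_columns(word_xs, gap=50.0):
    """Count columns as runs of left edges separated by more than `gap`."""
    ordered = sorted(word_xs)
    if not ordered:
        return 0
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > gap)


def identify_profile(column_count):
    """Assumed profiles: the result of the first operation picks one."""
    return "multi_column" if column_count > 1 else "single_column"


def configure_operations(word_xs, default_ops=DEFAULT_OPERATIONS):
    # Perform one operation from the default set.
    column_count = detect_columns(word_xs)
    # Identify a profile from its result.
    profile = identify_profile(column_count)
    # Modify the remaining operations according to the profile.
    remaining = [op for op in default_ops if op != "detect_columns"]
    if profile == "multi_column":
        remaining.insert(0, "merge_column_flow")  # hypothetical extra step
    return profile, remaining


print(configure_operations([72, 74, 73]))
print(configure_operations([72, 73, 320, 322]))
```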
