Relabelling of tokenized symbols in fontless structured document image representations
    Patent application
    Status: lapsed

    Publication number: US20010043349A1

    Publication date: 2001-11-22

    Application number: US09884418

    Filing date: 2001-06-18

    Abstract: A processor is provided with a first set of digital information that includes a first structured representation of a document. From this first set, the processor produces a second set of digital information that includes a second structured representation of the document. The second structured representation is lossless and includes a set of tokens and a set of positions. At least one token in the set has an associated semantic label, which may be a character code associated with various font types in the second structured representation. A computer program may obtain the semantic label and store it in the second structured representation. The first and second representations may be resolution-dependent structured representations with first and second characteristic resolutions, respectively. The first representation, but not the second, is provided in digital form to an untrusted recipient. The recipient requests a search for particular content of the second representation, including semantic labels. A highlighted version of the first representation of the document is then provided to the recipient.
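
The token-and-position structure and the search-then-highlight flow described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Token`, `Placement`, `TokenizedPage`, `find_labels`, and `rescale` are hypothetical, token bitmaps are omitted, each semantic label is assumed to be a single character, placements are assumed to be stored in reading order, and highlighting is reduced to returning bounding boxes scaled to the resolution of the representation given to the recipient.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass(frozen=True)
class Token:
    """One distinct symbol shape; the bitmap itself is omitted in this sketch."""
    width: int
    height: int
    label: Optional[str] = None  # semantic label, e.g. a character code


@dataclass(frozen=True)
class Placement:
    """One occurrence of a token on the page, given by its top-left corner."""
    token_id: int
    x: int
    y: int


@dataclass
class TokenizedPage:
    dpi: int                     # characteristic resolution of this representation
    tokens: Dict[int, Token]     # the set of tokens
    placements: List[Placement]  # the set of positions (assumed reading order)


def find_labels(page: TokenizedPage, query: str) -> List[List[Box]]:
    """Return, for each occurrence of `query` in the label sequence,
    the bounding boxes of the matching token placements."""
    if not query:
        return []
    labels = [page.tokens[p.token_id].label for p in page.placements]
    wanted = list(query)
    hits: List[List[Box]] = []
    for start in range(len(labels) - len(wanted) + 1):
        # Unlabelled tokens carry None and therefore never match a character.
        if labels[start:start + len(wanted)] == wanted:
            run = page.placements[start:start + len(wanted)]
            hits.append([
                (p.x, p.y, page.tokens[p.token_id].width, page.tokens[p.token_id].height)
                for p in run
            ])
    return hits


def rescale(box: Box, from_dpi: int, to_dpi: int) -> Box:
    """Map a box from one characteristic resolution to another."""
    x, y, w, h = box

    def scale(v: int) -> int:
        return round(v * to_dpi / from_dpi)

    return (scale(x), scale(y), scale(w), scale(h))


# Hypothetical second representation at 600 dpi: "cat", an unlabelled speck, "a".
page = TokenizedPage(
    dpi=600,
    tokens={
        0: Token(20, 30, "c"),
        1: Token(20, 30, "a"),
        2: Token(16, 30, "t"),
        3: Token(8, 8, None),
    },
    placements=[
        Placement(0, 100, 200),
        Placement(1, 124, 200),
        Placement(2, 148, 200),
        Placement(3, 170, 222),
        Placement(1, 200, 200),
    ],
)

# Search the labelled representation, then express the hit at 300 dpi,
# the assumed resolution of the representation the recipient holds.
for hit in find_labels(page, "at"):
    print([rescale(box, page.dpi, 300) for box in hit])
```

In this sketch only the scaled boxes would need to leave the trusted side, which mirrors the abstract's point that the recipient never receives the second representation itself.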

