Method of determining Unicode values corresponding to the text in digital documents
    Granted invention patent
    Method of determining Unicode values corresponding to the text in digital documents (in force)

    Publication number: US07636885B2

    Publication date: 2009-12-22

    Application number: US11447826

    Filing date: 2006-06-06

    IPC classification: G06F17/00

    CPC classification: G06F17/2264 G06F17/2217

    Abstract: A method of determining Unicode values corresponding to the text in digital documents includes: providing a digital document containing information related to the text in the document, the information including at least one set of data selected from the group consisting of: the numerical character code comprised by a single byte value or a sequence of multiple bytes, the glyph name corresponding to the character code for simple fonts, the code-to-Unicode mapping provided by a ToUnicode CMap, and font outline data embedded in the document; obtaining the information related to the text from the document; and determining the Unicode values corresponding to a specific code of a specific font on a per-glyph basis by executing a cascade of determination steps for each code separately, the cascade being executed in a predetermined sequence using different sources of information.
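The per-code cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not state the order of the determination steps, so the order used here (ToUnicode CMap, then glyph name, then a mapping derived from the embedded font outline data) is an assumption. `FontInfo`, `outline_map`, `resolve`, and the small glyph-name table are hypothetical names introduced for illustration. The `uniXXXX` glyph-name form follows the Adobe Glyph List naming convention.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class FontInfo:
    """Hypothetical container for the text-related data of one font."""
    to_unicode: Dict[int, str] = field(default_factory=dict)   # ToUnicode CMap: code -> text
    glyph_names: Dict[int, str] = field(default_factory=dict)  # simple-font encoding: code -> glyph name
    outline_map: Dict[int, str] = field(default_factory=dict)  # assumed: code -> text derived from outline data


# Tiny illustrative table; a real glyph list has thousands of entries.
GLYPH_NAME_TO_UNICODE = {"A": "A", "space": " ", "eacute": "\u00e9"}


def from_to_unicode(font: FontInfo, code: int) -> Optional[str]:
    return font.to_unicode.get(code)


def from_glyph_name(font: FontInfo, code: int) -> Optional[str]:
    name = font.glyph_names.get(code)
    if name is None:
        return None
    if name in GLYPH_NAME_TO_UNICODE:
        return GLYPH_NAME_TO_UNICODE[name]
    # 'uniXXXX' names carry the code point as four uppercase hex digits.
    hex_part = name[3:]
    if name.startswith("uni") and len(hex_part) == 4 and all(
        c in "0123456789ABCDEF" for c in hex_part
    ):
        return chr(int(hex_part, 16))
    return None


def from_outline(font: FontInfo, code: int) -> Optional[str]:
    return font.outline_map.get(code)


# Assumed order; the abstract only says the sequence is predetermined.
CASCADE: List[Callable[[FontInfo, int], Optional[str]]] = [
    from_to_unicode,
    from_glyph_name,
    from_outline,
]


def resolve(font: FontInfo, code: int) -> Optional[str]:
    """Run the cascade for a single code; the first step that answers wins."""
    for step in CASCADE:
        text = step(font, code)
        if text is not None:
            return text
    return None


font = FontInfo(
    to_unicode={0x41: "A"},
    glyph_names={0x41: "space", 0x42: "eacute", 0x43: "uni20AC"},
    outline_map={0x44: "Z"},
)
results = {code: resolve(font, code) for code in (0x41, 0x42, 0x43, 0x44, 0x45)}
```

Because each code is resolved separately, one font can draw some of its codes from the ToUnicode CMap and others from glyph names or outline data. A code that no step can answer stays unresolved (`None`) instead of being guessed.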
