Method and apparatus for identifying spoof documents
    1.
    Granted invention patent
    Method and apparatus for identifying spoof documents (status: in force)

    Publication No.: US06442606B1

    Publication date: 2002-08-27

    Application No.: US09374315

    Filing date: 1999-08-12

    IPC classification: G06F1300

    Abstract: A method and apparatus are provided for indexing electronic documents that include one or more visible text portions and one or more non-visible text portions. The method includes the step of identifying an electronic document. Once the electronic document is identified, a set of words is selected from a particular tag type that is associated with one or more non-visible text portions of the electronic document. Each word in the selected set of words is compared with words in the one or more visible text portions of the electronic document. An index word set is then determined for the electronic document based on matches between words in the selected set of words and words in the one or more visible text portions of the electronic document.
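
The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the document is HTML, that the "particular tag type" is a `<meta name="keywords">` tag, and that a "match" is an exact, case-insensitive word match; none of these details are stated in the abstract. The names `index_word_set`, `_PageParser`, and `_words` are hypothetical and exist only for this example.

```python
import re
from html.parser import HTMLParser

_WORD_RE = re.compile(r"[a-z0-9]+")


def _words(text):
    """Split text into lowercase alphanumeric words."""
    return _WORD_RE.findall(text.lower())


class _PageParser(HTMLParser):
    """Collects meta-keyword words and visible-text words from one page."""

    def __init__(self):
        super().__init__()
        self.meta_words = []        # words from the non-visible tag, in order
        self.visible_words = set()  # words found in rendered text
        self._skip_depth = 0        # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Assumption: the non-visible tag type is <meta name="keywords">.
        if tag == "meta" and (attr_map.get("name") or "").lower() == "keywords":
            self.meta_words.extend(_words(attr_map.get("content") or ""))
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Simplification: all text outside <script>/<style> counts as visible.
        if not self._skip_depth:
            self.visible_words.update(_words(data))


def index_word_set(html):
    """Return the meta-keyword words that also occur in the visible text."""
    parser = _PageParser()
    parser.feed(html)
    parser.close()
    seen = set()
    result = []
    for word in parser.meta_words:
        if word in parser.visible_words and word not in seen:
            seen.add(word)
            result.append(word)
    return result


page = (
    '<html><head><meta name="keywords" content="cars, loans, cars, travel">'
    "<style>p{color:red}</style></head>"
    "<body><p>Cheap travel deals and used cars.</p></body></html>"
)
print(index_word_set(page))  # ['cars', 'travel']
```

In this sketch the keyword "loans" is dropped because it never appears in the visible text, so a page cannot be indexed under a term that exists only in its hidden keywords.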
