Method and apparatus for identifying words described in a portable electronic document
    1.
    Type: Invention publication
    Status: Lapsed

    Publication No.: EP0702322A2

    Publication date: 1996-03-20

    Application No.: EP95303939.3

    Filing date: 1995-06-08

    IPC classification: G06K9/68

    CPC classification: G06K9/00463

    Abstract: A method and apparatus for identifying words stored in a portable electronic document. A digital computation apparatus stores a page of a document including characters in text segments that have not been identified as words. A word identifying mechanism analyzes the text segments of the page and stores the text segments as text objects in a linked list. The word identifying mechanism identifies words from the text objects in the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using position data associated with the text segments. The identified words are stored in a word list and are sorted if necessary. A method of the present invention receives a text segment from a page of a document having multiple text segments and associated position data, including x and y coordinates for each text segment. A text object is created for each text segment, and the text objects are entered into a linked list. Words are then identified from the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using the associated position data. Words that are identified in the text objects are added to a word list. The above steps are repeated until the end of the page is reached. The method and apparatus can be used for searching for words in a portable electronic document.
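
    The flow described in this abstract (text segments become text objects in a linked list, and words are found from word breaks and from gaps between objects) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `TextObject` fields, the `width` value, and the `gap_threshold` and `line_tolerance` parameters are hypothetical, and whitespace is used as the only in-segment word break.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TextObject:
    """One text segment plus its position data (hypothetical field names)."""
    text: str
    x: float      # left edge of the segment
    y: float      # baseline of the segment
    width: float  # horizontal extent of the segment (assumed to be known)
    next_obj: Optional["TextObject"] = None  # linked-list pointer


def build_linked_list(
    segments: Iterable[Tuple[str, float, float, float]]
) -> Optional[TextObject]:
    """Create one text object per segment and chain them in input order."""
    head = tail = None
    for text, x, y, width in segments:
        node = TextObject(text, x, y, width)
        if head is None:
            head = tail = node
        else:
            tail.next_obj = node
            tail = node
    return head


def identify_words(
    head: Optional[TextObject],
    gap_threshold: float = 2.0,   # assumed: gap wider than this separates words
    line_tolerance: float = 1.0,  # assumed: larger y difference means a new line
) -> List[str]:
    """Walk the list, splitting on whitespace and on gaps between objects."""
    words: List[str] = []
    current = ""
    prev = None
    node = head
    while node is not None:
        if prev is not None:
            same_line = abs(node.y - prev.y) <= line_tolerance
            gap = node.x - (prev.x + prev.width)
            if not same_line or gap > gap_threshold:
                if current:
                    words.append(current)
                current = ""
        for ch in node.text:
            if ch.isspace():  # word break inside a segment
                if current:
                    words.append(current)
                current = ""
            else:
                current += ch
        prev = node
        node = node.next_obj
    if current:
        words.append(current)
    return words


segments = [
    ("Hel", 0, 100, 15),
    ("lo wor", 15, 100, 30),
    ("ld", 45, 100, 10),
    ("again", 60, 100, 25),
    ("next", 0, 88, 20),
]
word_list = identify_words(build_linked_list(segments))
```

    In this example a word that is split across abutting segments ("Hel" + "lo") is rejoined, while a 5-unit gap or a change of baseline starts a new word.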

    Method and apparatus for identifying words described in a page description language file
    4.
    Type: Invention publication
    Status: Lapsed

    Publication No.: EP0701223A3

    Publication date: 1997-05-28

    Application No.: EP95305330.3

    Filing date: 1995-07-31

    Inventor: Ayers, Robert M.

    IPC classification: G06K9/20

    CPC classification: G06K9/00463

    Abstract: A method and apparatus for identifying words described in a page description file. A computer device stores a page description language file which includes characters that have not been identified as words by the page description language. A word identifying mechanism reads the page description language file and groups characters to form at least one word from the characters. The system preferably transfers words to a client process capable of processing words at a request of the client process. In a method for identifying words from a page description file, characters are read from the file and are stored in a word buffer until a word break is detected based upon character position data stored in the file. The contents of the word buffer are then provided to a client process as an identified word. The method can also sort the characters from the file into a display order prior to storing the characters in the word buffer. The method and apparatus can be used for searching for words in a page description file.
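
    The word-buffer step in this abstract can be sketched as a generator, which also loosely models handing a word to a client only when the client asks for one. This is an illustrative sketch, not the patented method: the `(character, x, y, width)` tuple layout, the `gap_factor` heuristic, and `line_tolerance` are all assumptions.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# Hypothetical layout: (character, x, y, advance width)
Char = Tuple[str, float, float, float]


def iter_words(
    chars: Iterable[Char],
    gap_factor: float = 0.3,      # assumed: gap > 30% of previous width breaks a word
    line_tolerance: float = 1.0,  # assumed: larger y difference means a new line
) -> Iterator[str]:
    """Buffer characters until a word break is detected, then yield the word."""
    buffer: List[str] = []
    prev: Optional[Char] = None
    for ch, x, y, width in chars:
        if prev is not None and buffer:
            _, px, py, pwidth = prev
            new_line = abs(y - py) > line_tolerance
            wide_gap = x - (px + pwidth) > gap_factor * pwidth
            if new_line or wide_gap:
                yield "".join(buffer)  # hand the buffered word to the client
                buffer = []
        if ch.isspace():
            if buffer:
                yield "".join(buffer)
                buffer = []
        else:
            buffer.append(ch)
        prev = (ch, x, y, width)
    if buffer:
        yield "".join(buffer)


chars = [
    ("T", 0, 0, 6), ("o", 6, 0, 6),    # "To"
    ("b", 18, 0, 6), ("e", 24, 0, 6),  # 6-unit gap before "b": positional break
    (" ", 30, 0, 6),                   # explicit space character
    ("o", 36, 0, 6), ("r", 42, 0, 6),  # "or"
]
words = iter_words(chars)
first_word = next(words)     # the client requests one word
remaining = list(words)      # the client requests the rest
```

    Because the generator is lazy, no further characters are read until the client requests the next word.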


    Method and apparatus for identifying words described in a portable electronic document
    6.
    Type: Invention publication
    Status: Lapsed

    Publication No.: EP0702322A3

    Publication date: 1997-06-04

    Application No.: EP95303939.3

    Filing date: 1995-06-08

    IPC classification: G06K9/68

    CPC classification: G06K9/00463

    Abstract: A method and apparatus for identifying words stored in a portable electronic document. A digital computation apparatus stores a page of a document including characters in text segments that have not been identified as words. A word identifying mechanism analyzes the text segments of the page and stores the text segments as text objects in a linked list. The word identifying mechanism identifies words from the text objects in the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using position data associated with the text segments. The identified words are stored in a word list and are sorted if necessary. A method of the present invention receives a text segment from a page of a document having multiple text segments and associated position data, including x and y coordinates for each text segment. A text object is created for each text segment, and the text objects are entered into a linked list. Words are then identified from the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using the associated position data. Words that are identified in the text objects are added to a word list. The above steps are repeated until the end of the page is reached. The method and apparatus can be used for searching for words in a portable electronic document.
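
    The abstract's closing points (a word list that is sorted if necessary, and use of the result for searching) might look like the following sketch. The `WordEntry` type, the reading-order sort key, and case-insensitive matching are assumptions chosen for illustration; the sketch also assumes that y increases up the page and that words on one line share the same y value.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WordEntry:
    """An identified word with the position of its first character (hypothetical)."""
    text: str
    x: float
    y: float


def sort_word_list(words: List[WordEntry]) -> List[WordEntry]:
    """Sort into reading order: top line first (largest y), then left to right."""
    return sorted(words, key=lambda w: (-w.y, w.x))


def find_word(words: List[WordEntry], query: str) -> List[WordEntry]:
    """Return every entry whose text matches the query, ignoring case."""
    q = query.casefold()
    return [w for w in words if w.text.casefold() == q]


unsorted_words = [
    WordEntry("fox", 40, 700),
    WordEntry("The", 0, 700),
    WordEntry("jumps", 0, 688),
    WordEntry("the", 30, 688),
]
sorted_words = sort_word_list(unsorted_words)
hits = find_word(sorted_words, "the")
```

    Keeping the position with each word lets a search report where on the page each match occurs.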


    Method and apparatus for identifying words described in a page description language file
    7.
    Type: Invention publication
    Status: Lapsed

    Publication No.: EP0701223A2

    Publication date: 1996-03-13

    Application No.: EP95305330.3

    Filing date: 1995-07-31

    Inventor: Ayers, Robert M.

    IPC classification: G06K9/20

    CPC classification: G06K9/00463

    Abstract: A method and apparatus for identifying words described in a page description file. A computer device stores a page description language file which includes characters that have not been identified as words by the page description language. A word identifying mechanism reads the page description language file and groups characters to form at least one word from the characters. The system preferably transfers words to a client process capable of processing words at a request of the client process. In a method for identifying words from a page description file, characters are read from the file and are stored in a word buffer until a word break is detected based upon character position data stored in the file. The contents of the word buffer are then provided to a client process as an identified word. The method can also sort the characters from the file into a display order prior to storing the characters in the word buffer. The method and apparatus can be used for searching for words in a page description file.
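
    The optional step of sorting characters into display order before buffering can be sketched as follows. The abstract does not say how the sort is done, so this is one possible approach under stated assumptions: characters are `(character, x, y, width)` tuples, y increases up the page, and characters whose baselines differ by at most a hypothetical `line_tolerance` belong to the same line.

```python
from typing import List, Tuple

# Hypothetical layout: (character, x, y, advance width)
Char = Tuple[str, float, float, float]


def display_order(chars: List[Char], line_tolerance: float = 2.0) -> List[Char]:
    """Group characters into lines by baseline, then order each line left to right."""
    lines: List[Tuple[float, List[Char]]] = []  # (baseline of first char, members)
    for c in sorted(chars, key=lambda c: -c[2]):  # top of page first
        if lines and abs(lines[-1][0] - c[2]) <= line_tolerance:
            lines[-1][1].append(c)
        else:
            lines.append((c[2], [c]))
    ordered: List[Char] = []
    for _, members in lines:
        ordered.extend(sorted(members, key=lambda c: c[1]))
    return ordered


# File order differs from display order, and one baseline is slightly offset.
file_order = [
    ("b", 6, 700.5, 6),
    ("c", 0, 688, 6),
    ("a", 0, 700, 6),
]
shown = "".join(ch for ch, _, _, _ in display_order(file_order))
```

    Grouping by baseline before sorting on x keeps a slightly raised or lowered character on its own line instead of moving it ahead of its neighbours.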