Published patent application
EP0702322A2 Method and apparatus for identifying words described in a portable electronic document (lapsed)

  • Title: Method and apparatus for identifying words described in a portable electronic document
  • Application number: EP95303939.3
    Filing date: 1995-06-08
  • Publication number: EP0702322A2
    Publication date: 1996-03-20
  • Inventors: Paknad, Mohammed Daryoush; Ayers, Robert M.
  • Applicant: ADOBE SYSTEMS INC.
  • Applicant address: 1585 Charleston Road, Mountain View, California 94039-7900, US
  • Assignee: ADOBE SYSTEMS INC.
  • Current assignee: ADOBE SYSTEMS INC.
  • Current assignee address: 1585 Charleston Road, Mountain View, California 94039-7900, US
  • Agent: Wombwell, Francis
  • Priority: US304678, 1994-09-12
  • Main classification: G06K9/68
  • IPC classification: G06K9/68
Abstract:
A method and apparatus for identifying words stored in a portable electronic document. A digital computation apparatus stores a page of a document including characters in text segments that have not been identified as words. A word identifying mechanism analyzes the text segments of the page and stores the text segments as text objects in a linked list. The word identifying mechanism identifies words from the text objects in the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using position data associated with the text segments. The identified words are stored in a word list and are sorted if necessary. A method of the present invention receives a text segment from a page of a document having multiple text segments and associated position data, including x and y coordinates for each text segment. A text object is created for each text segment, and the text objects are entered into a linked list. Words are then identified from the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using the associated position data. Words that are identified in the text objects are added to a word list. The above steps are repeated until the end of the page is reached. The method and apparatus can be used for searching for words in a portable electronic document.
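The flow the abstract describes (text segments with x and y coordinates, a linked list of text objects, word breaks inside objects, and gaps between objects) can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `TextObject`, `build_linked_list`, and `identify_words`, the `width` field, and both threshold values are assumptions introduced here for illustration, and whitespace is assumed to be the only in-segment word break.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TextObject:
    """One text segment with its position data; `next` links it into a list.

    All field names are hypothetical; the abstract only mentions x and y.
    """
    text: str
    x: float       # left edge of the segment
    y: float       # baseline of the segment
    width: float   # assumed horizontal extent, used to measure gaps
    next: Optional["TextObject"] = None


def build_linked_list(
    segments: Iterable[Tuple[str, float, float, float]]
) -> Optional[TextObject]:
    """Create one text object per segment and chain them in arrival order."""
    head = tail = None
    for text, x, y, width in segments:
        node = TextObject(text, x, y, width)
        if head is None:
            head = tail = node
        else:
            tail.next = node
            tail = node
    return head


def identify_words(
    head: Optional[TextObject],
    gap_threshold: float = 2.0,    # assumed: larger gaps separate words
    line_tolerance: float = 1.0,   # assumed: larger y shifts start a new line
) -> List[str]:
    """Walk the list, splitting on whitespace and on gaps between objects."""
    words: List[str] = []
    current = ""
    prev: Optional[TextObject] = None
    node = head
    while node is not None:
        if prev is not None:
            same_line = abs(node.y - prev.y) <= line_tolerance
            gap = node.x - (prev.x + prev.width)
            # A new line or a large horizontal gap ends the pending word.
            if (not same_line or abs(gap) > gap_threshold) and current:
                words.append(current)
                current = ""
        for ch in node.text:
            if ch.isspace():
                # Word break inside a text object.
                if current:
                    words.append(current)
                    current = ""
            else:
                current += ch
        prev = node
        node = node.next
    if current:
        words.append(current)
    return words


# Hypothetical page: one word is split across three adjacent segments.
segments = [
    ("Port", 10.0, 100.0, 20.0),
    ("able docu", 30.5, 100.0, 45.0),
    ("ment", 75.5, 100.0, 20.0),
    ("format", 10.0, 88.0, 30.0),
]
word_list = identify_words(build_linked_list(segments))
```

In this sketch, adjacent segments on the same line are joined when the gap between them is small, so fragments such as "docu" and "ment" end up in one word. Hyphenation at line ends and the optional sorting of the word list mentioned in the abstract are left out.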