Abstract:
A method and apparatus for identifying words stored in a portable electronic document. A digital computation apparatus stores a page of a document including characters in text segments that have not been identified as words. A word identifying mechanism analyzes the text segments of the page and stores the text segments as text objects in a linked list. The word identifying mechanism identifies words from the text objects in the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using position data associated with the text segments. The identified words are stored in a word list and are sorted if necessary. A method of the present invention receives a text segment from a page of a document having multiple text segments and associated position data, including x and y coordinates for each text segment. A text object is created for each text segment, and the text objects are entered into a linked list. Words are then identified from the linked list by analyzing the text objects for word breaks and by analyzing gaps between text objects using the associated position data. Words that are identified in the text objects are added to a word list. The above steps are repeated until the end of the page is reached. The method and apparatus can be used for searching for words in a portable electronic document.
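The steps in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the `TextObject` fields, the `width` value per segment, the `gap_threshold` and `line_tolerance` parameters, and the function names are all assumptions introduced here. Sorting of the word list is omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TextObject:
    """One text segment plus its position data (hypothetical layout)."""
    text: str
    x: float       # left edge of the segment
    y: float       # baseline of the segment
    width: float   # horizontal extent (assumed to be known)
    next: Optional["TextObject"] = None


def build_linked_list(
    segments: Iterable[Tuple[str, float, float, float]]
) -> Optional[TextObject]:
    """Create a text object per segment and chain them in arrival order."""
    head: Optional[TextObject] = None
    tail: Optional[TextObject] = None
    for text, x, y, width in segments:
        node = TextObject(text, x, y, width)
        if tail is None:
            head = tail = node
        else:
            tail.next = node
            tail = node
    return head


def identify_words(
    head: Optional[TextObject],
    gap_threshold: float = 2.0,   # assumed units: same as x and width
    line_tolerance: float = 0.5,  # assumed tolerance for "same line"
) -> List[str]:
    """Walk the list, splitting on whitespace and on positional gaps."""
    words: List[str] = []
    current = ""
    prev: Optional[TextObject] = None
    node = head
    while node is not None:
        if prev is not None:
            gap = node.x - (prev.x + prev.width)
            new_line = abs(node.y - prev.y) > line_tolerance
            # A large gap or a line change ends the word in progress.
            if (new_line or gap > gap_threshold) and current:
                words.append(current)
                current = ""
        for ch in node.text:
            if ch.isspace():          # word break inside a text object
                if current:
                    words.append(current)
                    current = ""
            else:
                current += ch
        prev = node
        node = node.next
    if current:
        words.append(current)
    return words


segments = [
    ("Port", 10.0, 100.0, 20.0),
    ("able ", 30.5, 100.0, 22.0),
    ("doc", 55.0, 100.0, 15.0),
    ("ument", 70.0, 100.0, 25.0),
    ("search", 10.0, 88.0, 30.0),
]
word_list = identify_words(build_linked_list(segments))
```

In this sketch a word may span several text objects ("Port" + "able") when the gap between them is small, while a change of baseline or a wide gap closes the current word even if no space character is present.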
Abstract:
An electronic document version management system handles multiple versions of an electronic document and displays the nature of the changes made between versions.
Abstract:
A method and apparatus for identifying words described in a page description file. A computer device stores a page description language file that includes characters which the page description language has not identified as words. A word identifying mechanism reads the page description language file and groups the characters to form at least one word. The system preferably transfers words, at the request of a client process, to that client process, which is capable of processing words. In a method for identifying words from a page description file, characters are read from the file and stored in a word buffer until a word break is detected based upon character position data stored in the file. The contents of the word buffer are then provided to the client process as an identified word. The method can also sort the characters from the file into display order before storing them in the word buffer. The method and apparatus can be used for searching for words in a page description file.
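The word-buffer approach described here might look roughly like the following. It is a minimal sketch under stated assumptions: each character is a hypothetical `(glyph, x, y, width)` tuple, the client process is modeled as a plain callback, `gap_threshold` and `line_tolerance` are invented parameters, and display order is assumed to mean top to bottom (larger `y` first) and then left to right.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical character record: (glyph, x, y, width)
Char = Tuple[str, float, float, float]


def emit_words(
    chars: Iterable[Char],
    client: Callable[[str], None],
    gap_threshold: float = 1.5,    # assumed horizontal gap that ends a word
    line_tolerance: float = 0.5,   # assumed tolerance for "same line"
    sort_display_order: bool = False,
) -> None:
    """Buffer characters until a word break, then hand the word to the client."""
    if sort_display_order:
        # Assumption: y grows upward, so higher lines come first.
        chars = sorted(chars, key=lambda c: (-c[2], c[1]))

    buffer: List[str] = []
    prev: Optional[Char] = None

    def flush() -> None:
        if buffer:
            client("".join(buffer))
            buffer.clear()

    for ch in chars:
        glyph, x, y, _width = ch
        if prev is not None:
            _, px, py, pw = prev
            # Position-based word break: new line or a wide horizontal gap.
            if abs(y - py) > line_tolerance or x - (px + pw) > gap_threshold:
                flush()
        if glyph.isspace():
            flush()                # explicit space also ends the word
        else:
            buffer.append(glyph)
        prev = ch
    flush()


received: List[str] = []
page_chars: List[Char] = [
    ("i", 6.0, 10.0, 2.0),
    ("H", 0.0, 10.0, 6.0),
    ("o", 0.0, 0.0, 5.0),
    ("k", 5.0, 0.0, 5.0),
]
emit_words(page_chars, received.append, sort_display_order=True)
```

Passing the client as a callback keeps the sketch independent of how words are consumed; a search routine, for example, could compare each delivered word against a query as it arrives.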