Abstract:
A system that intelligently abstracts and archives a document for storage and interprets a free-form user retrieval query to recall the document from the storage file. The system includes a method for automatically selecting keywords from the document using a parts-of-speech directory. A method is given for weighing the importance or centrality of each keyword with respect to the document of its origin. Using the same logic paths, the system takes a free-form query that describes the document in the same manner that it would have to be described to a secretary to "find" it in a filing cabinet, automatically determines the key matching terms, and finds the archived document(s) with the greatest affinity.
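A minimal sketch of the keyword selection, weighting, and query matching described above. The abstract does not give the actual directory contents or weighting formula, so the `FUNCTION_WORDS` set, the frequency-share weight, and all function names here are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical parts-of-speech directory: words listed here are treated as
# function words (articles, prepositions, ...) and never chosen as keywords.
FUNCTION_WORDS = {"a", "an", "the", "of", "to", "in", "for", "and", "is", "on", "about", "from"}

def keywords(text):
    """Return {keyword: weight}; the weight is the keyword's share of all
    keyword occurrences in the text (an assumed centrality measure)."""
    words = [w.strip('.,;:!?"()').lower() for w in text.split()]
    content = [w for w in words if w and w not in FUNCTION_WORDS]
    counts = Counter(content)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

def affinity(query, doc_keywords):
    """Sum the document weights of every keyword the query also contains.
    The query goes through the same keyword selection as the documents."""
    return sum(doc_keywords.get(w, 0.0) for w in keywords(query))

def best_match(query, archive):
    """Return the name of the archived document with the greatest affinity."""
    return max(archive, key=lambda name: affinity(query, archive[name]))
```

For example, with an archive built from `keywords("The budget memo for the sales budget")` and `keywords("Travel request to the Boston office")`, the query "the memo about the budget" shares keywords only with the first document, so `best_match` selects it.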
Abstract:
A system for reducing storage requirements and access times in a text processing machine for automatic spelling verification and hyphenation functions. The system includes a method for storing and accessing a word list file such that legal prefixes and suffixes are truncated and only the unique root element, or "stem", of a word is stored. A set of unique rules is provided for prefix/suffix removal during compilation of the word list file and subsequent accessing of the word list file. Spelling verification is accomplished by applying the rules to the words whose spelling is to be verified; applying the rules also provides, under most circumstances, a natural hyphenation break point at the prefix-stem and stem-suffix junctions.
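The stem-only storage idea can be sketched as follows. The abstract does not list the actual removal rules, so the `PREFIXES` and `SUFFIXES` tables, the minimum-stem-length guard, and the function names are hypothetical. The one property taken from the abstract is that the same rules run both when the word list is compiled and when a word is verified.

```python
# Hypothetical affix tables; the real rule set is not given in the abstract.
PREFIXES = ("un", "re", "pre")
SUFFIXES = ("ing", "ed", "s", "ly")

def split_word(word):
    """Return (prefix, stem, suffix), stripping at most one affix per side
    and only when at least three characters would remain."""
    prefix = next((p for p in PREFIXES
                   if word.startswith(p) and len(word) > len(p) + 2), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES
                   if rest.endswith(s) and len(rest) > len(s) + 2), "")
    stem = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, stem, suffix

def compile_word_list(words):
    """Store only the stems of the given words."""
    return {split_word(w)[1] for w in words}

def verify(word, stems):
    """A word passes if its stem, after the same stripping, is stored."""
    return split_word(word)[1] in stems

def hyphenation_points(word):
    """Mark the prefix-stem and stem-suffix junctions with hyphens."""
    return "-".join(part for part in split_word(word) if part)
```

With a word list compiled from "load" and "print", the words "reloading" and "unprinted" both verify, and `hyphenation_points("reloading")` gives `re-load-ing`. A scheme this simple also accepts affix combinations that are not real words, which is one reason a production rule set would need to be more selective.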
Abstract:
A system for reducing the computation required to match a misspelled word against various candidates from a dictionary to find one or more words that represent the best match to the misspelled word. The major facility offered is the ability to computationally discern the degree of apparent match that exists between words that do not perfectly match a given target word without requiring the computationally tedious procedure of character-by-character positional matching, which necessitates shifting and realignment to accommodate differences between the candidate and target words due to character differences or added and dropped syllables. The system includes a method for storing and retrieving words from the dictionary based on their likelihood of being the correct version of a misspelled word and then reviewing those words further using the Prescan Alpha Content Match to reduce the number of candidates that must then be examined in a high-resolution positional match to find the candidate(s) that match the misspelled word with the greatest character affinity. The Prescan Alpha Content Match reduces the number of candidates in contention so as to make a high-resolution match computationally feasible on a real-time basis.
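One way to picture a position-independent prescan is to compare only which letters two words contain, and how many of each. The sketch below is an illustration of that general idea, not the Prescan Alpha Content Match as claimed; the scoring formula, the `threshold` value, and the function names are assumptions.

```python
from collections import Counter

def alpha_content_score(target, candidate):
    """Fraction of letters the two words share, ignoring letter position.
    No shifting or realignment is needed because only counts are compared."""
    shared = sum((Counter(target) & Counter(candidate)).values())
    return shared / max(len(target), len(candidate))

def prescan(target, candidates, threshold=0.7):
    """Keep only candidates whose letter content is close to the target's.
    The survivors would then go on to a costlier positional match."""
    return [c for c in candidates if alpha_content_score(target, c) >= threshold]
```

For the misspelling "recieve", the candidate "receive" has identical letter content and scores 1.0, while "banana" shares no letters and is dropped before any positional comparison is attempted.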
Abstract:
A system for reducing the computation required to match a misspelled word against various candidates from a dictionary to find one or more words that represent the best match to the misspelled word. The major facility offered is the ability to computationally discern the degree of apparent match that exists between words that do not perfectly match a given target word without requiring the computationally tedious procedure of character-by-character positional matching, which necessitates shifting and realignment to accommodate differences between the candidate and target words due to character differences or added and dropped syllables. The system includes a method for storing and retrieving words from the dictionary based on their likelihood of being the correct version of a misspelled word and then reviewing those words further to reduce the number of candidates that must then be examined in a high-resolution positional match to find the candidate(s) that match the misspelled word with the greatest character affinity. This technique reduces the number of candidates in contention so as to make a high-resolution match computationally feasible on a real-time basis. The discriminant potential and the real-time computational burden associated with the technique are balanced in an optimal manner.
Abstract:
A method and system for compacting text data to be transmitted over communications lines, thereby reducing the data volume and transmission time. Transmitting and receiving text processing systems are provided with identical library memories containing text strings such as words commonly used in correspondence. Each word in a document to be communicated is compared to the transmitting system's word library and, if found in the library, only the library address is transmitted. If the word is not found in the library, then it is added to the transmitting system's library, sent, and added to the receiving system's library. The receiving system reconstructs the document by using the received addresses to access the appropriate words from its library and place them in the document. The system combines this word match encoding with character match encoding and facsimile run length encoding for communicating words not found in the system library.
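The word match encoding step can be sketched as below, assuming each library is a plain Python list whose indices serve as addresses. The function names are hypothetical, and the character match and run length encodings for unknown words are left out: an unknown word is simply sent as a literal string.

```python
def encode(words, library):
    """Replace each known word with its library address.
    Unknown words are added to the sender's library and sent literally."""
    message = []
    for word in words:
        if word in library:
            message.append(library.index(word))  # transmit the address only
        else:
            library.append(word)                 # sender learns the word
            message.append(word)                 # transmit the word itself
    return message

def decode(message, library):
    """Rebuild the document, learning each literal word as it arrives."""
    words = []
    for item in message:
        if isinstance(item, int):
            words.append(library[item])          # look up the address
        else:
            library.append(item)                 # receiver learns the word
            words.append(item)
    return words
```

Starting from identical libraries `["dear", "sir"]`, the text "dear sir thank you dear sir" is sent as `[0, 1, "thank", "you", 0, 1]`, and both libraries end up holding the same four words in the same order, so later addresses stay in step. A list keeps the sketch short; a dictionary from word to address would avoid the linear `index` search.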
Abstract:
Spelling errors in a word processing system are detected and presented to the operator for correction at the end of a document page. A dictionary memory contains representations of the correct spellings of the most frequently used words. As each word is typed, it is stored in a word queue, where it is compared to the contents of the dictionary memory. If no match is found, the word and its location on the page are stored in an error memory. When an end-of-page indicator is set, the printer automatically repositions the print head at the ending character of the first word in the error list. When the operator keys in the correct spelling, the printer is caused to remove the misspelled word from the page and type the correct spelling. The corresponding word in the error memory is also corrected. As each misspelled word in the error memory is corrected, the remainder of the memory is scanned and repetitions of the same spelling error are automatically corrected.
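The error memory and the automatic correction of repeated errors can be sketched in software terms as follows. The printer mechanics are omitted, and the `ask_operator` callback, which stands in for the operator keying in a correction, is a hypothetical name, as are the function names.

```python
def find_errors(page_words, dictionary):
    """Build the error memory: (position, word) for each word not found
    in the dictionary."""
    return [(i, w) for i, w in enumerate(page_words) if w.lower() not in dictionary]

def correct_page(page_words, dictionary, ask_operator):
    """Ask the operator once per distinct misspelling, then reuse that
    answer for every repetition of the same error on the page."""
    corrected = list(page_words)
    fixes = {}
    for position, word in find_errors(page_words, dictionary):
        if word not in fixes:
            fixes[word] = ask_operator(word)  # operator keys in the spelling
        corrected[position] = fixes[word]     # repeats are fixed automatically
    return corrected
```

On a page where "leter" appears twice, the operator is prompted for it only once, and the second occurrence is replaced without further input.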
Abstract:
A data processing system and method for the correction of address information on mail. The method makes use of a contextual predictive keying method for enabling an operator to read the image of an addressee mailing address and type in a minimum number of keystrokes necessary to sort the mail piece down to the final sorting level at the destination post office.
Abstract:
A data processing system, method and program are disclosed to optimize mail piece sorting and the mapping of mail down to the carrier walk sequence using real-time statistical data. The invention makes use of techniques such as fast OCR devices at a sending location, or deferred processing of OCR scanned mail, to accumulate volume statistics indicating the number of mail pieces being routed to particular addressees at a destination postal region on a given day. Information on the mail volumes being directed to a particular postal region is collected over data communications links prior to the receipt of the actual mail pieces. The efficiency of sorting is maximized at the destination postal region by organizing the sorting apparatus to remove the highest-volume addressee's mail first. This requires the compilation of the real-time volume statistics from all of the sending postal regions sending mail to the destination postal location. In this manner, the maximum number of letters can be sorted out on every pass through the sorting apparatus at the destination location. This minimizes the total number of reading operations required in order to achieve a desired level of mail sorting separation. Because the mail volume statistics are available at the destination location prior to sorting, bin allocation can be customized at each stage of the sorting operations to yield the highest final patron or addressee sort. In this manner, the time for every subsequent pass through the sorting apparatus is reduced. This enables sorting directly to the addressee level and the distribution of the mail down to carrier walk sequence.
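The effect of removing the highest-volume addressees first can be illustrated with a simplified model. The model is an assumption, not the disclosed method: every pass reads all mail still unsorted, each pass has a fixed number of bins, and one bin per pass is reserved for overflow mail that goes on to the next pass. The function names are hypothetical.

```python
from collections import Counter

def plan_passes(volumes, bins_per_pass):
    """Assign addressees to passes, highest volume first.
    One bin per pass is reserved for overflow, so bins_per_pass must be >= 2."""
    if bins_per_pass < 2:
        raise ValueError("need at least one dedicated bin plus an overflow bin")
    ranked = [addressee for addressee, _ in Counter(volumes).most_common()]
    dedicated = bins_per_pass - 1
    return [ranked[i:i + dedicated] for i in range(0, len(ranked), dedicated)]

def total_reads(volumes, passes):
    """Count reading operations: each pass reads every piece not yet sorted out."""
    remaining = sum(volumes.values())
    reads = 0
    for addressees in passes:
        reads += remaining
        remaining -= sum(volumes[a] for a in addressees)
    return reads
```

With volumes of 50, 30, 15 and 5 pieces for four addressees and three bins per pass, the volume-ordered plan needs 100 + 20 = 120 reads, while sorting the two smallest addressees first would need 100 + 80 = 180.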
Abstract:
A system and method are disclosed for enabling the technique of deferred processing of OCR scanned mail to be compatible with existing techniques for mechanical sortation of mail that use standard sort barcode formats which are common to a given destination postal system. This enables deferred OCR processed mail to be sorted on an unsegregated basis along with other types of mail which have not been processed by the deferred OCR technique. This allows the OCR encoded mail to be processed along with other types of encoded mail bearing a standard sort barcode that has been imprinted using prior technology such as OCR or manual code desks.
Abstract:
A data processing method and system are disclosed to provide active pigeon hole sorting for mail pieces in a postal system. The method is based upon the receipt of deferred optical character recognition (DOCR) statistics for mail pieces in transit to a destination postal region. An ordered list of addressees is compiled from the DOCR statistics. From this ordered list, the sorting case for sorting the mail is partitioned to eliminate pigeon holes for those postal recipients not receiving mail on that day. Still further, the pigeon holes in the sorting case are actively indicated with a prompting light to help the operator physically sort the mail piece down to delivery sequence. The assignment of delivery stops to pigeon holes is also developed so that adjacent pigeon holes follow the carrier walk, reflecting geographic juxtaposition rather than street number.
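The partitioning of the sorting case can be sketched as below, assuming the carrier walk is available as an ordered list of delivery stops and the DOCR statistics as a mapping from stop to piece count. Both data shapes and the function name are hypothetical.

```python
def assign_pigeon_holes(walk_sequence, mail_counts):
    """Give each stop that has mail today the next free pigeon hole,
    in carrier walk order; stops with no mail get no hole at all."""
    active = [stop for stop in walk_sequence if mail_counts.get(stop, 0) > 0]
    return {stop: hole for hole, stop in enumerate(active)}
```

For a walk of "12 Elm", "3 Oak", "14 Elm", "5 Oak" where "3 Oak" receives nothing, the three remaining stops occupy holes 0, 1 and 2 in walk order. The hole number looked up for a scanned mail piece is what a prompting light would indicate to the operator.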