Abstract:
A method and apparatus are disclosed for making predictions about entities represented in documents and for information analysis of text documents or the like, from a large number of such documents. Predictive models (80) are executed responsive to variables (70) derived from canonical documents (60) to determine documents containing desired attributes or characteristics. The canonical documents are derived from standardized documents (30), which in turn are derived from original documents.
Abstract:
An object-based, semantic representation (277) for documents as information containers, using a controlled taxonomy, facilitates the extraction of meaning from such information containers to provide high-level, automated document interpretation. The high-level functions that are enabled include automated filtering of an information container in accordance with the controlled taxonomy and a set of conditions, to produce a result having only those information objects that are applicable under the specified set of conditions. These functions further include automated combination of information objects which comprise the information containers, to build a composite information container that reflects combined meaning of the associated documents, and automated handling of references from one information container to another.
Abstract:
A summary of an input document is generated by extracting at least one sentence from the document and parsing the extracted sentences into components, such as in a parse tree (110). Sentence reduction processing is performed to mark components which can be removed from the parse trees (135). Sentence reduction can include context importance processing, probabilistic processing, and linguistic knowledge based processing. Probabilistic processing includes identifying sentence combination operations and establishing rules for applying the sentence combination operations to mark the parse trees to merge at least two sentences (140). Sentence combination processing also provides a paste operation to operate on the marked components to effect the indicated removal and combination of sentence components, thereby generating summary sentences for the input document.
Abstract:
Waiting prior to engaging an automated service, for enhancement thereof, is disclosed. In one embodiment, a computer-implemented method first determines an automated service to be performed. The method waits a predetermined time between a minimum time and a maximum time, before performing the automated service. In one embodiment, the method determines the predetermined time by performing a statistical regression as to the predetermined time that should be waited based on a length of a received text.
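The waiting step in the abstract above can be illustrated with a minimal Python sketch. It assumes an ordinary least-squares line fitted to hypothetical (text length, wait time) pairs; the function names, the sample data, and the 1 to 10 second bounds are illustrative assumptions and do not come from the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def wait_time(text, slope, intercept, min_wait=1.0, max_wait=10.0):
    """Predict a wait from the text length, clamped to [min_wait, max_wait]."""
    predicted = slope * len(text) + intercept
    return max(min_wait, min(max_wait, predicted))


# Hypothetical training data: text lengths (characters) and observed waits (seconds).
lengths = [10, 20, 30, 40]
waits = [1.0, 2.0, 3.0, 4.0]
slope, intercept = fit_line(lengths, waits)
```

A caller would compute `wait_time(received_text, slope, intercept)` and sleep for that long before starting the automated service. The clamp is what keeps the regression output between the minimum and maximum times.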
Abstract:
A "domain-general" method for topical segmentation of a document input includes the steps of: extracting one or more selected terms from a document; linking occurrences of the extracted terms based upon the proximity of similar terms; and assigning weighted scores to paragraphs of the document input corresponding to the linked occurrences. In accordance with the present invention, the values of the assigned scores depend upon the type of the selected terms, e.g., common noun, proper noun, or pronominal, and the position of the linked occurrences with respect to the paragraphs, e.g., front, during, or rear. Upon zero-sum normalization, the assigned scores represent the boundaries of the topical segments of the document input.
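The linking, scoring, and normalization steps in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions: the weight table, the `max_gap` proximity rule, and all function names are hypothetical, since the abstract gives no concrete values, and term extraction is assumed to have already produced the paragraph indices for each term.

```python
# Hypothetical weights per term type and position within a link.
WEIGHTS = {
    "common": {"front": 2.0, "during": -1.0, "rear": 1.0},
    "proper": {"front": 3.0, "during": -1.0, "rear": 2.0},
    "pronominal": {"front": 1.0, "during": -1.0, "rear": 1.0},
}


def link_occurrences(paragraph_indices, max_gap):
    """Group sorted paragraph indices into links; a gap above max_gap starts a new link."""
    links = []
    current = [paragraph_indices[0]]
    for p in paragraph_indices[1:]:
        if p - current[-1] <= max_gap:
            current.append(p)
        else:
            links.append(current)
            current = [p]
    links.append(current)
    return links


def score_paragraphs(num_paragraphs, occurrences, term_types, max_gap=1):
    """Score each paragraph from the links that cover it, then zero-sum normalize."""
    scores = [0.0] * num_paragraphs
    for term, paragraphs in occurrences.items():
        weights = WEIGHTS[term_types[term]]
        for link in link_occurrences(sorted(set(paragraphs)), max_gap):
            first, last = link[0], link[-1]
            for p in range(first, last + 1):
                if p == first:
                    scores[p] += weights["front"]
                elif p == last:
                    scores[p] += weights["rear"]
                else:
                    scores[p] += weights["during"]
    mean = sum(scores) / num_paragraphs
    return [s - mean for s in scores]  # normalized scores sum to zero
```

In this sketch, paragraphs with a positive normalized score would be read as candidate segment boundaries, and paragraphs in the middle of a link score negative.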
Abstract:
A multilingual software application for controlling vehicle servicing equipment includes a plurality of dynamic link libraries corresponding to supported national languages. The multilingual application is configured to receive input selecting one of the supported national languages, load the corresponding dynamic link library, and redisplay during run-time operating instructions for the vehicle servicing equipment in the selected national language. An international language management system is provided to develop, maintain, and synchronize the dynamic link libraries based on processing resource files from which the dynamic link libraries are compiled.
Abstract:
A system and method for interacting with a computer using utterances, speech processing and natural language processing. The system comprises a speech processor for searching a first grammar file for a matching phrase for the utterance, and for searching a second grammar file for the matching phrase if the matching phrase is not found in the first grammar file. The system also includes a natural language processor for searching a database for a matching entry for the matching phrase; and an application interface for performing an action associated with the matching entry if the matching entry is found in the database. The system utilizes context-specific grammars, thereby enhancing speech recognition and natural language processing efficiency. Additionally, the system adaptively and interactively "learns" words and phrases, and their associated meanings.
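The two-stage grammar search and database lookup in the abstract above can be sketched as follows. This is a minimal illustration, assuming grammars are sets of normalized phrases and the database maps phrases to callables; these data structures and all names are hypothetical, and the speech recognition itself is assumed to have already produced text.

```python
def normalize(utterance):
    """Lowercase the utterance and collapse runs of whitespace."""
    return " ".join(utterance.lower().split())


def find_matching_phrase(utterance, first_grammar, second_grammar):
    """Search the first grammar, then fall back to the second if no match is found."""
    phrase = normalize(utterance)
    for grammar in (first_grammar, second_grammar):
        if phrase in grammar:
            return phrase
    return None


def handle_utterance(utterance, first_grammar, second_grammar, database):
    """Run the action for the matched phrase, or return None if nothing matches."""
    phrase = find_matching_phrase(utterance, first_grammar, second_grammar)
    if phrase is None:
        return None
    action = database.get(phrase)
    if action is None:
        return None
    return action()
```

Searching a small context-specific grammar first and the general one only on a miss is the efficiency idea the abstract describes. The adaptive "learning" step is not modeled here.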
Abstract:
A method and apparatus are disclosed for comparing an input or query file to a set of files to detect similarities between the query file and the set of files, and digitally shredding files that match, to some degree, the query file and doing so from within the comparison feature. Using a comparison program, the query file is compared with each non-query file in a data processing system, ranging from a stand-alone computer to an enterprise computing network. A list of non-query files having some degree of similarity with the query file is compiled and presented to the user via a user interface within the comparison program. Certain or all non-query files can then be deleted by marking the names of those non-query files in the list. The comparison program can be of the type using either clustering or coalescing, or both, known hashing techniques, or other comparison algorithms.
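The comparison step in the abstract above can be sketched with one possible hashing technique. The example below assumes word shingles hashed with SHA-1 and compared by Jaccard similarity; that choice, the threshold, and all names are illustrative assumptions, and the abstract does not specify this algorithm. File contents are passed in as strings, and the deletion (shredding) step is deliberately left out.

```python
import hashlib


def shingle_hashes(text, k=3):
    """Return the set of SHA-1 digests of all k-word shingles in the text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha1(s.encode("utf-8")).hexdigest() for s in shingles}


def similarity(text_a, text_b):
    """Jaccard similarity of the two shingle-hash sets, between 0.0 and 1.0."""
    hashes_a, hashes_b = shingle_hashes(text_a), shingle_hashes(text_b)
    if not hashes_a and not hashes_b:
        return 1.0
    return len(hashes_a & hashes_b) / len(hashes_a | hashes_b)


def find_similar(query_text, files, threshold=0.5):
    """List (name, score) for files at or above the threshold, most similar first."""
    scored = [(name, similarity(query_text, text)) for name, text in files.items()]
    matches = [(name, score) for name, score in scored if score >= threshold]
    return sorted(matches, key=lambda item: -item[1])
```

The list returned by `find_similar` corresponds to what the user interface would present, from which the user marks the files to delete.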
Abstract:
The present invention provides a facility for selecting from a sequence of natural language characters combinations of characters that may be words. The facility uses probability indications for each of a plurality of words as a function of adjacent characters.
Abstract:
An automated call routing system and method operate on a call routing objective of a calling party, expressed in natural speech of the calling party. The system incorporates a speech recognition function, as to which a calling party's natural-speech call routing objective provides an input (15), and which is trained to recognize a plurality of meaningful phrases (10), each such phrase being related to a specific call routing objective. Upon recognition of one or more of such meaningful phrases in a calling party's input speech, an interpretation function (20) then acts on such calling party's routing objective request to either implement the calling party's requested routing objective, or to enter into a dialog (25) with the calling party to obtain additional information from which a sufficient confidence level can be attained to implement that routing objective.
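The interpretation function in the abstract above, which either routes the call or opens a dialog, can be sketched as follows. The phrase table, the confidence values, and the 0.8 threshold are hypothetical, and speech recognition is assumed to have already produced a text transcript.

```python
# Hypothetical meaningful phrases, each tied to a routing objective and a confidence.
PHRASE_ROUTES = {
    "collect call": ("collect", 0.9),
    "my bill": ("billing", 0.6),
    "wrong number": ("credit", 0.85),
}


def route_call(transcript, threshold=0.8):
    """Return ("route", objective) when confident enough, otherwise ("dialog", best_guess)."""
    text = transcript.lower()
    confidences = {}
    for phrase, (objective, confidence) in PHRASE_ROUTES.items():
        if phrase in text:
            # Keep the strongest evidence seen for each objective.
            confidences[objective] = max(confidences.get(objective, 0.0), confidence)
    if not confidences:
        return ("dialog", None)
    best = max(confidences, key=confidences.get)
    if confidences[best] >= threshold:
        return ("route", best)
    return ("dialog", best)
```

A `("dialog", ...)` result stands for the step where the system asks the caller for more information until the confidence level is sufficient to route.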