Abstract:
A method for processing language input can include the step of determining at least two possible meanings for a language input. For each possible meaning, a probability that the possible meaning is a correct interpretation of the language input can be determined. At least one relative delta computation can be computed based at least in part upon the probabilities. At least one irregularity within the language input can be detected based upon the relative delta computation. The irregularity can include mumble, ambiguous input, and/or compound input. At least one programmatic action can be performed responsive to the detection of the irregularity.
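The abstract does not say how the relative delta is computed or which thresholds separate the irregularity types, so the following is only a minimal sketch under stated assumptions. It assumes the relative delta is the gap between two probabilities divided by the larger one, that a small gap between the best and worst meanings suggests mumble, and that a small gap between the top two meanings suggests ambiguous input. The function names and threshold values are hypothetical, and the sketch does not try to distinguish compound input.

```python
from typing import Dict, Optional


def relative_delta(p_high: float, p_low: float) -> float:
    """Gap between two probabilities, scaled by the larger one (assumed definition)."""
    if p_high <= 0:
        return 0.0
    return (p_high - p_low) / p_high


def detect_irregularity(
    probs: Dict[str, float],
    ambiguity_threshold: float = 0.2,  # hypothetical value
    mumble_threshold: float = 0.2,     # hypothetical value
) -> Optional[str]:
    """Return 'mumble', 'ambiguous', or None for a set of meaning probabilities."""
    if len(probs) < 2:
        raise ValueError("need at least two possible meanings")
    ranked = sorted(probs.values(), reverse=True)
    top, second, last = ranked[0], ranked[1], ranked[-1]
    # Assumed rule: if even the best and worst meanings are close,
    # no interpretation stands out, which is treated here as mumble.
    if len(ranked) > 2 and relative_delta(top, last) < mumble_threshold:
        return "mumble"
    # Assumed rule: the top two meanings are nearly tied.
    if relative_delta(top, second) < ambiguity_threshold:
        return "ambiguous"
    return None
```

A caller could map the returned label to a programmatic action, for example re-prompting the user on "mumble" or asking a disambiguating question on "ambiguous".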
Abstract:
In a natural language, mixed-initiative system, a method of processing user dialogue can include receiving a user input and determining whether the user input specifies an action to be performed or a token of an action. The user input can be selectively routed to an action interpreter or a token interpreter according to the determining step.
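The routing step can be illustrated with a small sketch. The abstract does not describe how the system decides between an action and a token, so the keyword test below is a stand-in assumption, and `ACTION_WORDS`, `specifies_action`, `action_interpreter`, `token_interpreter`, and `route` are all hypothetical names.

```python
from typing import Tuple

# Hypothetical vocabulary of words that signal an action request.
ACTION_WORDS = {"transfer", "buy", "sell", "cancel"}


def specifies_action(user_input: str) -> bool:
    """Stand-in classifier: treat input containing an action word as an action."""
    return any(word in ACTION_WORDS for word in user_input.lower().split())


def action_interpreter(user_input: str) -> Tuple[str, str]:
    """Placeholder for the component that interprets a requested action."""
    return ("action", user_input)


def token_interpreter(user_input: str) -> Tuple[str, str]:
    """Placeholder for the component that interprets a token (parameter) of an action."""
    return ("token", user_input)


def route(user_input: str) -> Tuple[str, str]:
    """Selectively route the input according to the determining step."""
    if specifies_action(user_input):
        return action_interpreter(user_input)
    return token_interpreter(user_input)
```

In a mixed-initiative dialogue, "transfer 100 dollars" would go to the action interpreter, while a follow-up such as "100 dollars" would go to the token interpreter.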
Abstract:
A method for collecting data for statistical modeling purposes can include the step of selecting at least one user interface type from a plurality of previously defined user interface types. Parameters of the selected interface type can be defined for a particular data collection instance. Target participant data can be inputted. A data collection interface based upon the selected interface type and defined parameters can be deployed. Messages can be automatically conveyed to data providers selected in accordance with the target participant data. The data providers can be permitted to access the deployed data collection interface. Data provided by the data providers can be automatically stored and used for statistical modeling purposes related to the data collection instance.
Abstract:
The invention disclosed herein concerns a system (100) and method (600) for building a language model representation of an NLU application. The method (600) can include categorizing an NLU application domain (602), classifying a corpus in view of the categorization (604), and training at least one language model in view of the classification (606). The categorization produces a hierarchical tree of categories, sub-categories and end targets across one or more features for interpreting one or more natural language input requests. During development of an NLU application, a developer assigns sentences of the NLU application to categories, sub-categories or end targets across one or more features to associate each sentence with its desired interpretations. A language model builder (140) iteratively builds multiple language models for this sentence data, evaluates them against a test corpus, partitions the data based on the categorization, and rebuilds the models, so as to produce an optimal configuration of language models to interpret and respond to language input requests for the NLU application.
Abstract:
A method of extracting information from text within a natural language understanding system can include processing a text input through at least one statistical model for each of a plurality of features to be extracted from the text input. For each feature, at least one value can be determined, at least in part, using the statistical model associated with the feature. One value for each feature can be combined to create a complex information target. The complex information target can be output.
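As an illustration of combining one value per feature into a complex information target, here is a minimal sketch. The abstract does not specify the statistical models, so a simple keyword-frequency scorer stands in for each per-feature model; `keyword_model`, `extract_target`, and the example features are hypothetical.

```python
from typing import Callable, Dict, Optional

# A feature model maps text to a score for each candidate value of that feature.
FeatureModel = Callable[[str], Dict[str, float]]


def keyword_model(keywords: Dict[str, str]) -> FeatureModel:
    """Build a stand-in 'statistical' model from a word -> value mapping."""
    def score(text: str) -> Dict[str, float]:
        counts: Dict[str, int] = {}
        for word in text.lower().split():
            value = keywords.get(word)
            if value is not None:
                counts[value] = counts.get(value, 0) + 1
        total = sum(counts.values())
        # Normalize counts into relative frequencies; empty if nothing matched.
        return {v: c / total for v, c in counts.items()} if total else {}
    return score


def extract_target(
    text: str, models: Dict[str, FeatureModel]
) -> Dict[str, Optional[str]]:
    """Pick the best-scoring value for each feature and combine them."""
    target: Dict[str, Optional[str]] = {}
    for feature, model in models.items():
        scores = model(text)
        target[feature] = max(scores, key=scores.get) if scores else None
    return target
```

With an "action" model and an "object" model, an input like "I want to buy shares" would yield a combined target holding one value per feature, rather than a single flat category.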
Abstract:
A method of processing text within a natural language understanding system can include applying a first tokenization technique to a sentence using a statistical tokenization model. A second tokenization technique using a named entity can be applied to the sentence when the first tokenization technique does not extract a needed token according to a class of the sentence. A token determined according to at least one of the tokenization techniques can be output.
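The fallback between the two tokenization techniques can be sketched as follows. Both tokenizers here are toy stand-ins: a regular expression takes the place of the statistical tokenization model, and a small lookup table takes the place of named-entity matching. The class-to-token mapping `REQUIRED_TOKEN` and all function names are assumptions made for illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical mapping: sentence class -> token type that class needs.
REQUIRED_TOKEN = {"transfer": "amount"}

# Hypothetical named entities for spelled-out amounts.
NAMED_AMOUNTS = {"fifty": "50", "hundred": "100"}


def statistical_tokenize(sentence: str) -> Dict[str, str]:
    """Stand-in for the statistical model: finds only amounts written as $<digits>."""
    match = re.search(r"\$(\d+)", sentence)
    return {"amount": match.group(1)} if match else {}


def named_entity_tokenize(sentence: str) -> Dict[str, str]:
    """Stand-in for the named-entity technique: looks up known amount words."""
    for word in sentence.lower().split():
        if word in NAMED_AMOUNTS:
            return {"amount": NAMED_AMOUNTS[word]}
    return {}


def extract_token(sentence: str, sentence_class: str) -> Optional[str]:
    """Apply the first technique, then fall back to the second if the needed token is missing."""
    needed = REQUIRED_TOKEN.get(sentence_class)
    if needed is None:
        return None  # this class of sentence does not need a token
    tokens = statistical_tokenize(sentence)
    if needed not in tokens:
        tokens = named_entity_tokenize(sentence)
    return tokens.get(needed)
```

The second technique runs only when the first fails to produce the token that the sentence class requires, which matches the conditional ordering described above.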