摘要:
Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.
摘要:
Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single language process is created over time.
摘要:
The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
摘要:
A spoken language understanding method and system are provided. The method includes classifying a set of labeled candidate utterances based on a previously trained classifier, generating classification types for each candidate utterance, receiving confidence scores for the classification types from the trained classifier, sorting the classified utterances based on an analysis of the confidence score of each candidate utterance compared to a respective label of the candidate utterance, and rechecking candidate utterances according to the analysis. The system includes modules configured to control a processor in the system to perform the steps of the method.
摘要:
An apparatus and a method are provided for using semantic role labeling for spoken language understanding. A received utterance semantically parsed by semantic role labeling. A predicate or at least one argument is extracted from the semantically parsed utterance. An intent is estimated based on the predicate or the at least one argument. In another aspect, a method is provided for training a spoken language dialog system that uses semantic role labeling. An expert is provided with a group of predicate/argument pairs. Ones of the predicate/argument pairs are selected as intents. Ones of the arguments are selected as named entities. Mappings from the arguments to frame slots are designed.
摘要:
Word lattices that are generated by an automatic speech recognition system are used to generate a modified word lattice that is usable by a spoken language understanding module. In one embodiment, the spoken language understanding module determines a set of salient phrases by calculating an intersection of the modified word lattice, which is optionally preprocessed, and a finite state machine that includes a plurality of salient grammar fragments.
摘要:
A large amount of human labor is required to transcribe and annotate a training corpus that is needed to create and update models for automatic speech recognition (ASR) and spoken language understanding (SLU). Active learning enables a reduction in the amount of transcribed and annotated data required to train ASR and SLU models. In one aspect of the present invention, an active learning ASR process and active learning SLU process are coupled, thereby enabling further efficiencies to be gained relative to a process that maintains an isolation of data in both the ASR and SLU domains.
摘要:
Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
摘要:
A system and method are disclosed for providing semi-supervised learning for a spoken language understanding module using semantic role labeling. The method embodiment relates to a method of generating a spoken language understanding module. Steps in the method comprise selecting at least one predicate/argument pair as an intent from a set of the most frequent predicate/argument pairs for a domain, labeling training data using mapping rules associated with the selected at least one predicate/argument pair, training a call-type classification model using the labeled training data, re-labeling the training data using the call-type classification model and iteratively several of the above steps until training set labels converge.
摘要:
Combined active and semi-supervised learning to reduce an amount of manual labeling when training a spoken language understanding model classifier. The classifier may be trained with human-labeled utterance data. Ones of a group of unselected utterance data may be selected for manual labeling via active learning. The classifier may be changed, via semi-supervised learning, based on the selected ones of the unselected utterance data.