Abstract:
A computer-implemented method is disclosed for improving the accuracy of a directory assistance system. The method includes constructing a prefix tree based on a collection of alphabetically organized words. The prefix tree is utilized as a basis for generating splitting rules for a compound word included in an index associated with the directory assistance system. A language model check and a pronunciation check are conducted in order to determine which of the generated splitting rules are most likely correct. The compound word is split into word components based on the most likely correct rule or rules. The word components are incorporated into a data set associated with the directory assistance system, such as into a recognition grammar and/or the index.
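The prefix-tree step can be illustrated with a minimal sketch. It assumes the splitting rules are simply every segmentation of the compound into words found in the tree; the names `TrieNode`, `build_prefix_tree` and `candidate_splits` are hypothetical, and the language model and pronunciation checks that rank the candidates are not shown.

```python
from typing import Dict, List


class TrieNode:
    """One node of the prefix tree; `is_word` marks the end of a known word."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


def build_prefix_tree(words: List[str]) -> TrieNode:
    """Insert every word, character by character, into a prefix tree."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def candidate_splits(compound: str, root: TrieNode) -> List[List[str]]:
    """Return every segmentation of `compound` into words stored in the tree."""
    results: List[List[str]] = []

    def walk(start: int, parts: List[str]) -> None:
        if start == len(compound):
            results.append(parts)
            return
        node = root
        for end in range(start, len(compound)):
            node = node.children.get(compound[end])
            if node is None:
                return  # no known word continues along this path
            if node.is_word:
                walk(end + 1, parts + [compound[start:end + 1]])

    walk(0, [])
    return results


tree = build_prefix_tree(["air", "airport", "port", "shuttle"])
splits = candidate_splits("airportshuttle", tree)
# splits == [["air", "port", "shuttle"], ["airport", "shuttle"]]
```

In a fuller system each candidate would then be scored by the language model and pronunciation checks, and only the most likely split would be kept.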
Abstract:
An answering machine detection module is used to determine whether a call recipient is an actual person or an answering machine. The answering machine detection module includes a speech recognizer and a call analysis module. The speech recognizer receives an audible response of the call recipient to a call. The speech recognizer processes the audible response and provides an output indicative of recognized speech. The call analysis module processes the output of the speech recognizer to generate an output indicative of whether the call recipient is a person or an answering machine.
Abstract:
The speech recognizer includes a dictation language model providing a dictation model output indicative of a likely word sequence recognized based on an input utterance. A spelling language model provides a spelling model output indicative of a likely letter sequence recognized based on the input utterance. An acoustic model provides an acoustic model output indicative of a likely speech unit recognized based on the input utterance. A speech recognition component is configured to access the dictation language model, the spelling language model and the acoustic model. The speech recognition component weights the dictation model output and the spelling model output in calculating likely recognized speech based on the input utterance. The speech recognizer can also be configured to confine spelled speech to an active lexicon.
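One way to picture the weighting is a sketch in which each hypothesis carries an acoustic and a language model log-probability and each mode has a weight added in the log domain. This scoring formula is an assumption, not something the abstract specifies, and `Hypothesis`, `pick_best` and `mode_weights` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Hypothesis:
    text: str             # recognized word, or the word formed by spelled letters
    mode: str             # "dictation" or "spelling"
    acoustic_logp: float  # log-probability from the acoustic model
    lm_logp: float        # log-probability from the mode's language model


def pick_best(
    hyps: List[Hypothesis],
    mode_weights: Dict[str, float],
    active_lexicon: Optional[Set[str]] = None,
) -> Optional[Hypothesis]:
    """Pick the highest-scoring hypothesis after weighting the two modes."""
    # Optionally confine spelled hypotheses to words in the active lexicon.
    candidates = [
        h for h in hyps
        if h.mode != "spelling" or active_lexicon is None or h.text in active_lexicon
    ]
    if not candidates:
        return None
    # Assumed combination: acoustic score + LM score + log of the mode weight.
    return max(
        candidates,
        key=lambda h: h.acoustic_logp + h.lm_logp + math.log(mode_weights[h.mode]),
    )


hyps = [
    Hypothesis("sea", "dictation", -5.0, -2.0),
    Hypothesis("c", "spelling", -5.0, -1.5),
]
favor_dictation = pick_best(hyps, {"dictation": 0.8, "spelling": 0.2})
favor_spelling = pick_best(hyps, {"dictation": 0.2, "spelling": 0.8})
```

With the example scores, shifting the weights from dictation to spelling flips the winner from "sea" to "c", which is the effect the weighting is meant to provide.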
Abstract:
Systems and methods are described for adding entries to a custom lexicon used by a speech recognition engine of a speech interface in response to user interaction with the speech interface. In one embodiment, a speech signal is obtained when the user speaks a name of a particular item to be selected from among a finite set of items. If a phonetic description of the speech signal is not recognized by the speech recognition engine, then the user is presented with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item. After the user has selected the particular item via the means for selecting, the phonetic description of the speech signal is stored in association with a text description of the particular item in the custom lexicon.
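The fallback-and-learn flow can be sketched as follows, assuming a phonetic description is a tuple of phoneme symbols and the custom lexicon is a plain dictionary. `CustomLexicon`, `select_item` and `manual_choice` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Phonemes = Tuple[str, ...]


class CustomLexicon:
    """Maps phonetic descriptions to the text description of an item."""

    def __init__(self) -> None:
        self.entries: Dict[Phonemes, str] = {}

    def recognize(self, phonemes: Phonemes) -> Optional[str]:
        return self.entries.get(phonemes)

    def learn(self, phonemes: Phonemes, item_text: str) -> None:
        self.entries[phonemes] = item_text


def select_item(
    lexicon: CustomLexicon,
    phonemes: Phonemes,
    items: List[str],
    manual_choice: Callable[[List[str]], str],
) -> str:
    """Use speech if recognized; otherwise fall back to a non-speech selection."""
    recognized = lexicon.recognize(phonemes)
    if recognized in items:
        return recognized
    # Not recognized: let the user pick some other way (for example, by tapping).
    chosen = manual_choice(items)
    # Store the unrecognized pronunciation with the chosen item's text.
    lexicon.learn(phonemes, chosen)
    return chosen


lexicon = CustomLexicon()
spoken = ("b", "j", "oe", "r", "k")      # made-up phoneme symbols
artists = ["Björk", "Beck", "Blur"]
first = select_item(lexicon, spoken, artists, lambda items: items[0])
second = select_item(lexicon, spoken, artists, lambda items: items[2])
```

The second call no longer needs the manual fallback, because the pronunciation was stored after the first selection.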
Abstract:
A method for managing an interaction of a calling party with a communication partner is provided. The method includes automatically determining if the communication partner expects DTMF input. The method also includes translating speech input to one or more DTMF tones and communicating the one or more DTMF tones to the communication partner, if the communication partner expects DTMF input.
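The translation step can be sketched as a lookup from spoken words to keypad keys and from keys to the standard DTMF frequency pairs. The word list, the function names and the boolean `partner_expects_dtmf` flag are assumptions; how the system decides that the partner expects DTMF input is not shown.

```python
from typing import List, Tuple

# Hypothetical vocabulary mapping spoken words to keypad keys.
WORD_TO_KEY = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9", "star": "*", "pound": "#", "hash": "#",
}

# Standard DTMF layout: each key is a (row frequency, column frequency) pair in Hz.
KEYPAD = ("123", "456", "789", "*0#")
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477)


def key_to_freqs(key: str) -> Tuple[int, int]:
    """Return the (low, high) tone frequencies for one keypad key."""
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if c >= 0:
            return ROW_HZ[r], COL_HZ[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def speech_to_dtmf(
    transcript: str, partner_expects_dtmf: bool
) -> List[Tuple[str, Tuple[int, int]]]:
    """Translate recognized words into DTMF keys and tone pairs, if expected."""
    if not partner_expects_dtmf:
        return []
    keys = [WORD_TO_KEY[w] for w in transcript.lower().split() if w in WORD_TO_KEY]
    return [(k, key_to_freqs(k)) for k in keys]


tones = speech_to_dtmf("four two pound", partner_expects_dtmf=True)
# tones == [("4", (770, 1209)), ("2", (697, 1336)), ("#", (941, 1477))]
```

Words outside the vocabulary are ignored in this sketch; a real system would more likely ask the caller to repeat them.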
Abstract:
The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels and updates a classification model associated with the classifier using the pseudo-semantic label.
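A minimal sketch of the pseudo-labeling step, assuming the classifier exposes a score per potential label and that "consistent with the denied confirmation" means excluding the label the caller rejected. Picking the highest-scoring remaining label is an assumption, and `pseudo_label` is a hypothetical name; the model update itself is not shown.

```python
from typing import Dict, Optional


def pseudo_label(scores: Dict[str, float], denied: str) -> Optional[str]:
    """Pick the best-scoring label other than the one the caller denied."""
    remaining = {label: s for label, s in scores.items() if label != denied}
    if not remaining:
        return None  # nothing left to choose from
    return max(remaining, key=remaining.get)


# The system asked "Did you mean billing?" and the caller said no.
scores = {"billing": 0.5, "support": 0.3, "sales": 0.2}
label = pseudo_label(scores, denied="billing")
# label == "support"
```

The utterance and its pseudo-label could then be added to the training data for the next update of the classification model.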
Abstract:
A method of providing automatic reading tutoring is disclosed. The method includes retrieving a textual indication of a story from a data store and creating a language model including constructing a target context free grammar indicative of a first portion of the story. A first acoustic input is received and a speech recognition engine is employed to recognize the first acoustic input. An output of the speech recognition engine is compared to the language model and a signal indicative of whether the output of the speech recognition engine matches at least a portion of the target context free grammar is provided.
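The comparison step can be approximated with a much simpler stand-in for the target context free grammar: a flat word sequence for the story portion, with a match defined as the recognized words appearing as a contiguous run. This simplification and the name `matches_portion` are assumptions made for illustration.

```python
def matches_portion(recognized: str, target: str) -> bool:
    """True if the recognized words appear, in order and contiguously, in target."""
    rec = recognized.lower().split()
    tgt = target.lower().split()
    if not rec:
        return False
    # Slide a window of len(rec) words across the target text.
    return any(
        tgt[i:i + len(rec)] == rec for i in range(len(tgt) - len(rec) + 1)
    )


story_portion = "The quick brown fox jumps over the lazy dog"
read_correctly = matches_portion("the quick brown", story_portion)  # True
skipped_a_word = matches_portion("the brown fox", story_portion)    # False
```

A real grammar would also need to handle punctuation, restarts and partial words, which this sketch ignores.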
Abstract:
A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.
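The abstract does not say how word importance is calculated. As one plausible illustration, the sketch below uses inverse document frequency over listings, a swapped-in technique: words that occur in few listings score high, and words that occur in every listing score zero. `word_importance` is a hypothetical name, and the omission and addition likelihoods are not modeled.

```python
import math
from collections import Counter
from typing import Dict, List


def word_importance(listings: List[str]) -> Dict[str, float]:
    """Score each word by log(N / number of listings containing the word)."""
    n = len(listings)
    doc_freq: Counter = Counter()
    for listing in listings:
        # Count each word once per listing.
        doc_freq.update(set(listing.lower().split()))
    return {word: math.log(n / count) for word, count in doc_freq.items()}


listings = [
    "Luigi Pizza Restaurant",
    "Marco Pizza Restaurant",
    "Main Street Restaurant",
]
importance = word_importance(listings)
# "restaurant" appears in every listing and scores 0.0;
# "luigi" appears in one listing and scores highest.
```

Scores like these could be used to weight words when training the language model, so that distinguishing words count for more than generic ones.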
Abstract:
A voice search system has a speech recognizer, a search component, and a dialog manager. A confidence measure generator receives speech recognition features from the speech recognizer, search features from the search component, and dialog features from the dialog manager, and calculates an overall confidence measure for voice search results based upon the features received. The invention can be extended to include the generation of additional features, based on those received from the individual components of the voice search system.
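A minimal sketch of combining the three feature groups into one score, assuming a logistic function over a weighted sum. The abstract does not specify the combination, and the feature names, the weights and `overall_confidence` are hypothetical.

```python
import math
from typing import Dict


def overall_confidence(
    asr_features: Dict[str, float],
    search_features: Dict[str, float],
    dialog_features: Dict[str, float],
    weights: Dict[str, float],
    bias: float = 0.0,
) -> float:
    """Map features from all three components to a confidence in (0, 1)."""
    features = {**asr_features, **search_features, **dialog_features}
    # Features without a weight contribute nothing to the sum.
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


confidence = overall_confidence(
    asr_features={"asr_score": 0.9},
    search_features={"top_match_score": 0.7},
    dialog_features={"turn_count": 2.0},
    weights={"asr_score": 2.0, "top_match_score": 1.5, "turn_count": -0.3},
)
```

In practice the weights would be learned from labeled voice search sessions rather than set by hand.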