摘要:
The method and apparatus utilize a filter to remove a variety of non-dictated words from data based on probability and improve the effectiveness of creating a language model.
摘要:
A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
摘要:
A local feedback mechanism for customizing training models based on user data and directed user feedback is provided in speech recognition applications. The feedback data is filtered at different levels to address privacy concerns for local storage and for submittal to a system developer for enhancement of generic training models.
摘要:
A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
摘要:
A local feedback mechanism for customizing training models based on user data and directed user feedback is provided in speech recognition applications. The feedback data is filtered at different levels to address privacy concerns for local storage and for submittal to a system developer for enhancement of generic training models.
摘要:
A speech application is analyzed to automatically identify timing problems therein. Information related to the application is recorded and problems are identified. A recognizer can be utilized to perform recognition on the recorded information.
摘要:
An expected dialog-turn (ED) value is estimated for evaluating a speech application. Parameters such as a confidence threshold setting can be adjusted based on the expected dialog-turn value. In a particular example, recognition results and corresponding confidence scores are used to estimate the expected dialog-turn value. The recognition results can be associated with a possible outcome for the speech application and a cost for the possible outcome can be used to estimate the expected dialog-turn value.
摘要:
An acoustic model includes word-specific models, that are specific to candidate words. The candidate words would otherwise be mapped to a series of general phones. A sub-series of the general phones representing the candidate word is modeled by a new phone and the new phone is dedicated to the candidate word, or a small group of similar words, but the new phone is not shared among all words that otherwise map to the sub-series of general phones.
摘要:
The present invention relates to a method of processing speech, in which input speech is processed to determine an input speech vector (or) representing a sample of the speech. A number of possible output states are defined with each output state (j) being represented by a number of state mixture components (m). Each state mixture component is then approximated by a weighted sum of a number of predetermined generic components (x), allowing the likelihoods of each output states (j) corresponding to the input speech vector (or) to be determined.