Abstract:
A speech recognition method includes the steps of receiving input speech containing vocabulary, processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values, and determining whether a first-best hypothesis of the N-best hypotheses is confusable with any vocabulary within the grammar. The first-best hypothesis is accepted as recognized speech corresponding to the received input speech if the first-best hypothesis is not determined to be confusable with any vocabulary within the grammar. Where the first-best hypothesis is determined to be confusable, at least one parameter value of the first-best hypothesis can be compared to at least one threshold value. The first-best hypothesis can be accepted as recognized speech corresponding to the received input speech, if the parameter value of the first-best hypothesis is greater than the threshold value.
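The acceptance logic described above can be sketched as follows. This is only an illustration under stated assumptions: the `Hypothesis` type, the `confusable_pairs` lookup set, and the 0.6 threshold are hypothetical and do not come from the abstract, which does not say how confusability is determined or which parameter is compared.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    confidence: float  # hypothetical parameter value, assumed to lie in [0, 1]


def accept_first_best(n_best, confusable_pairs, threshold=0.6):
    """Return the first-best text if it can be accepted, otherwise None.

    n_best is ordered best first. confusable_pairs is an assumed, precomputed
    set of frozensets naming vocabulary items that sound alike.
    """
    if not n_best:
        return None
    first = n_best[0]
    # Assumption: "confusable" means the word appears in any known pair.
    is_confusable = any(first.text in pair for pair in confusable_pairs)
    if not is_confusable:
        return first.text  # accepted without any threshold check
    if first.confidence > threshold:
        return first.text  # confusable, but confident enough
    return None  # neither condition met; the caller decides what happens next


pairs = {frozenset({"Austin", "Boston"})}
result = accept_first_best(
    [Hypothesis("Austin", 0.9), Hypothesis("Boston", 0.5)], pairs
)
```

A non-confusable first-best result bypasses the threshold entirely, so the stricter check is applied only where a misrecognition is more likely.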
Abstract:
A speech recognition method includes the steps of receiving input speech containing vocabulary, processing the input speech with a grammar to obtain N-best hypotheses and associated parameter values, and determining whether a first-best hypothesis of the N-best hypotheses is confusable with any vocabulary within the grammar. The first-best hypothesis is accepted as recognized speech corresponding to the received input speech if the first-best hypothesis is not determined to be confusable with any vocabulary within the grammar. Where the first-best hypothesis is determined to be confusable, at least one parameter value of the first-best hypothesis can be compared to at least one threshold value. The first-best hypothesis can be accepted as recognized speech corresponding to the received input speech if the parameter value of the first-best hypothesis is greater than the threshold value. The second-best hypothesis can be accepted as the recognized speech if its confidence score is within certain lower and upper threshold values and it is not confusable with the first-best hypothesis.
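The second-best fallback can be sketched as below. The ordering of the checks is an assumption: the abstract does not state whether the second-best hypothesis is considered only after the first-best fails its threshold. The function name, the tuple layout, and all threshold values are hypothetical.

```python
def choose_hypothesis(n_best, confusable_pairs,
                      accept_threshold=0.6, lower=0.3, upper=0.6):
    """Pick a hypothesis from n_best, a list of (text, confidence), best first.

    confusable_pairs is an assumed set of frozensets of similar-sounding items.
    Returns the accepted text, or None if nothing qualifies.
    """
    if not n_best:
        return None
    first_text, first_conf = n_best[0]
    # A first-best result that is not confusable is accepted outright.
    if not any(first_text in pair for pair in confusable_pairs):
        return first_text
    # Confusable first-best: accept only above the threshold.
    if first_conf > accept_threshold:
        return first_text
    # Assumed ordering: try the second-best only after the first-best fails.
    if len(n_best) > 1:
        second_text, second_conf = n_best[1]
        in_band = lower <= second_conf <= upper
        confusable_with_first = (
            frozenset({first_text, second_text}) in confusable_pairs
        )
        if in_band and not confusable_with_first:
            return second_text
    return None


pairs = {frozenset({"Austin", "Boston"})}
chosen = choose_hypothesis([("Austin", 0.5), ("Houston", 0.45)], pairs)
```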
Abstract:
A voice dialing method includes the steps of receiving an utterance from a user, decoding the utterance to identify a recognition result for the utterance, and communicating to the user the recognition result. If an indication is received from the user that the communicated recognition result is incorrect, then it is added to a rejection reference. Then, when the user repeats the misunderstood utterance, the rejection reference can be used to eliminate the incorrect recognition result as a potential subsequent recognition result. The method can be used for single or multiple digits or digit strings.
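A minimal sketch of the rejection reference, assuming it can be modeled as a set of previously rejected results that is consulted when the user repeats the utterance. The `DialSession` class and its method names are hypothetical, and the decoder is stood in for by a ready-made N-best list.

```python
def pick_result(n_best, rejected):
    """Return the best candidate that is not in the rejection reference."""
    for candidate in n_best:
        if candidate not in rejected:
            return candidate
    return None


class DialSession:
    """Tracks results the user has rejected during one dialing attempt."""

    def __init__(self):
        self.rejected = set()  # the rejection reference
        self.last_result = None

    def recognize(self, n_best):
        # n_best: candidate digit strings from the decoder, best first.
        self.last_result = pick_result(n_best, self.rejected)
        return self.last_result

    def user_rejects(self):
        # Called when the user indicates the communicated result was wrong.
        if self.last_result is not None:
            self.rejected.add(self.last_result)


session = DialSession()
first_try = session.recognize(["5551234", "5559234"])
session.user_rejects()
second_try = session.recognize(["5551234", "5559234"])  # repeated utterance
```

Because the rejected string is filtered out before a result is chosen, the same misrecognition cannot be offered twice in a row, even if the decoder still ranks it first.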
Abstract:
A method of circumstantial speech recognition in a vehicle. A plurality of parameters associated with a plurality of vehicle functions are monitored as an indication of current vehicle circumstances. At least one vehicle function is identified as a candidate for user-intended ASR control based on user interaction with the vehicle. The identified vehicle function is then used to disambiguate between potential commands contained in speech received from the user.
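One way to picture the disambiguation step, as a sketch only: the abstract does not specify which parameters are monitored or how a candidate function is identified, so the timestamp-based heuristic, the ten-second window, and both function names below are hypothetical.

```python
def identify_candidate_function(last_interaction, now, window_s=10.0):
    """Return the vehicle function the user touched most recently, or None.

    last_interaction maps a function name to the time (in seconds) of the
    user's last interaction with it. Recency is an assumed heuristic.
    """
    recent = {f: t for f, t in last_interaction.items() if now - t <= window_s}
    if not recent:
        return None
    return max(recent, key=recent.get)


def disambiguate(candidates, candidate_function):
    """Choose among (function, command, score) candidates.

    Candidates belonging to the identified function are preferred; if none
    match, the highest-scoring candidate overall is returned.
    """
    if not candidates:
        return None
    matching = [c for c in candidates if c[0] == candidate_function]
    pool = matching or candidates
    return max(pool, key=lambda c: c[2])


function = identify_candidate_function({"radio": 98.0, "climate": 50.0}, now=100.0)
command = disambiguate(
    [("radio", "volume up", 0.55), ("climate", "fan up", 0.60)], function
)
```

In this example the user touched the radio two seconds ago, so the radio command is chosen even though the climate command has a slightly higher score.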
Abstract:
A method of ambient noise injection for use with speech recognition in a production vehicle. The method includes the steps of monitoring audio including user speech, receiving an utterance from the user speech, retrieving vehicle-specific ambient noise, and prepending the vehicle-specific ambient noise to the utterance before pre-processing and decoding the utterance.
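The prepending step can be illustrated with a short sketch. It assumes audio is held as a list of PCM samples, that the noise and the utterance share a sample rate, and that vehicle-specific noise is kept in a lookup keyed by a vehicle identifier. The `inject_ambient_noise` name and the `noise_store` mapping are hypothetical.

```python
def inject_ambient_noise(utterance, vehicle_id, noise_store):
    """Return the utterance with vehicle-specific ambient noise prepended.

    utterance: sequence of PCM samples.
    noise_store: assumed mapping from vehicle identifier to noise samples
    recorded at the same sample rate as the utterance.
    """
    noise = noise_store.get(vehicle_id)
    if noise is None:
        # Assumption: with no stored noise, the utterance passes through as-is.
        return list(utterance)
    return list(noise) + list(utterance)


store = {"sedan_a": [0.01, -0.02]}
padded = inject_ambient_noise([0.5, 0.4], "sedan_a", store)
```

The result is what would then be handed to pre-processing and decoding, so those stages see the noise samples before the speech begins.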
Abstract:
A speech recognition method includes receiving input speech from a user, processing the input speech using a first grammar to obtain parameter values of a first N-best list of vocabulary, comparing a parameter value of a top result of the first N-best list to a threshold value, and if the compared parameter value is below the threshold value, then additionally processing the input speech using a second grammar to obtain parameter values of a second N-best list of vocabulary. Other preferred steps include: determining the input speech to be in-vocabulary if any of the results of the first N-best list is also present within the second N-best list, but out-of-vocabulary if none of the results of the first N-best list is within the second N-best list; and providing audible feedback to the user if the input speech is determined to be out-of-vocabulary.
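The two-grammar check can be sketched as follows. The function name, the 0.5 threshold, and the callable standing in for the second decoding pass are hypothetical. Treating an above-threshold top result as in-vocabulary is also an assumption, since the abstract only describes the below-threshold path.

```python
def classify_utterance(first_n_best, decode_with_second_grammar, threshold=0.5):
    """Label input speech as "in-vocabulary" or "out-of-vocabulary".

    first_n_best: list of (text, confidence) from the first grammar, best first.
    decode_with_second_grammar: callable returning the second N-best list; it
    is invoked only when the top result scores below the threshold.
    """
    if not first_n_best:
        return "out-of-vocabulary"  # assumption: no result means no match
    _, top_conf = first_n_best[0]
    if top_conf >= threshold:
        return "in-vocabulary"  # assumption: a confident top result is accepted
    second_n_best = decode_with_second_grammar()
    first_texts = {text for text, _ in first_n_best}
    second_texts = {text for text, _ in second_n_best}
    # In-vocabulary if any first-pass result also appears in the second pass.
    if first_texts & second_texts:
        return "in-vocabulary"
    return "out-of-vocabulary"


first_pass = [("call home", 0.3), ("call Tom", 0.2)]
label = classify_utterance(first_pass, lambda: [("play jazz", 0.4)])
if label == "out-of-vocabulary":
    feedback = "Sorry, that command is not available."  # stand-in for audible feedback
```

Passing the second decoder as a callable keeps the extra decoding pass from running unless the first pass is uncertain.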