Abstract:
A dynamic, exponential, feature-based language model is continually adjusted for each utterance a user makes, based on that user's usage history. The model is adjusted incrementally per user, across a large number of users, each with a unique history. The user history can include previously recognized utterances, text queries, and other user inputs. The history data for a user is processed to derive features, which are then added to the language model dynamically for that user.
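The adjustment described above can be illustrated with a minimal sketch. It assumes the derived features are simple word counts from the user's history and that they enter an exponential (log-linear) score with a single hand-picked weight. The names `history_features`, `adapted_distribution`, and `weight` are hypothetical and do not come from the abstract.

```python
import math
from collections import Counter


def history_features(history):
    """Derive word-count features from a user's past inputs (assumed feature type)."""
    counts = Counter()
    for text in history:
        counts.update(text.lower().split())
    return counts


def adapted_distribution(base_logprobs, user_counts, weight=0.5):
    """Re-score a base distribution with user features, exponential-model style.

    score(w) = base_logprob(w) + weight * log(1 + user_count(w)),
    followed by normalization so the result sums to 1.
    """
    scores = {
        word: logprob + weight * math.log1p(user_counts.get(word, 0))
        for word, logprob in base_logprobs.items()
    }
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return {word: math.exp(s - log_z) for word, s in scores.items()}


# Two acoustically similar words that the base model considers equally likely.
base = {"pizza": math.log(0.5), "piazza": math.log(0.5)}
history = ["pizza near me", "best pizza downtown"]
probs = adapted_distribution(base, history_features(history))
```

Because the user's history mentions "pizza" twice, the adapted distribution favors "pizza" over "piazza". Updating the counts after each new utterance and re-scoring would give the incremental, per-user behavior the abstract describes.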
Abstract:
Query history expansion may be provided. Upon receiving a spoken query from a user, an adapted language model may be applied to convert the spoken query to text. The adapted language model may comprise a plurality of queries interpolated from the user's previous queries and queries associated with other users. The spoken query may be executed and the results of the spoken query may be provided to the user.
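One way to picture the interpolation is as a weighted mix of two query distributions. The sketch below assumes whole-query relative frequencies and a fixed mixing weight `lam`. Both are illustrative choices, and the function names are hypothetical.

```python
from collections import Counter


def query_distribution(queries):
    """Relative frequency of each distinct query in a query log."""
    counts = Counter(q.lower() for q in queries)
    total = sum(counts.values())
    return {q: c / total for q, c in counts.items()}


def interpolate(user_dist, others_dist, lam=0.7):
    """Linear interpolation: lam * P_user(q) + (1 - lam) * P_others(q)."""
    keys = set(user_dist) | set(others_dist)
    return {
        q: lam * user_dist.get(q, 0.0) + (1 - lam) * others_dist.get(q, 0.0)
        for q in keys
    }


user = query_distribution(["weather seattle", "weather seattle", "pizza near me"])
others = query_distribution(["weather boston", "pizza near me"])
adapted = interpolate(user, others)
```

Queries the user has never issued, such as "weather boston", still receive some probability from other users' queries, while the user's own frequent queries rank higher.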
Abstract:
A speech recognition system uses multiple confidence thresholds to improve the quality of speech recognition results. The choice of which confidence threshold to use for a particular utterance may be based on one or more features relating to the utterance. In one particular implementation, the speech recognition system includes a speech recognition engine that provides speech recognition results and a confidence score for an input utterance. The system also includes a threshold selection component that determines, based on the received input utterance, a threshold value corresponding to the input utterance. The system further includes a threshold component that accepts the recognition results based on a comparison of the confidence score to the threshold value.
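The three components can be sketched as follows. The abstract does not say which utterance features drive the choice of threshold. This example assumes word count, with short utterances held to a stricter threshold. All names and numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    """Output of the recognition engine: a hypothesis and its confidence score."""
    text: str
    confidence: float


def select_threshold(text, short_threshold=0.8, long_threshold=0.6, short_len=2):
    """Threshold selection component: choose a threshold from an utterance feature.

    The feature used here (number of words) is an assumption for illustration.
    """
    return short_threshold if len(text.split()) <= short_len else long_threshold


def accept(result):
    """Threshold component: accept when confidence meets the selected threshold."""
    return result.confidence >= select_threshold(result.text)


short_ok = accept(Recognition("call home", 0.7))
long_ok = accept(Recognition("call my home number now", 0.7))
```

With the same confidence score of 0.7, the two-word utterance is rejected and the five-word utterance is accepted, because each is compared against a different threshold.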
Abstract:
A semantic error rate calculation may be provided. After receiving a spoken query from a user, the spoken query may be converted to text according to a first speech recognition hypothesis. A plurality of results associated with the converted query may be received and compared to a second plurality of results associated with the converted query.
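The abstract does not specify how the second set of results is obtained or how the two sets are compared. As a rough sketch, the comparison could be scored as one minus the Jaccard overlap of the two result sets. This measure and the function name `result_mismatch` are assumptions, not the method described.

```python
def result_mismatch(results_a, results_b):
    """Return 1 - |A & B| / |A | B| for two result lists (0.0 means identical sets)."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 0.0  # two empty result sets are treated as a perfect match
    return 1.0 - len(a & b) / len(a | b)


first = ["url1", "url2", "url3"]
second = ["url2", "url3", "url4"]
mismatch = result_mismatch(first, second)
```

A score near 0 suggests the two queries retrieve essentially the same results, even if their transcriptions differ word for word.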
Abstract:
An architecture employs an overall grammar organized as a set of context-specific grammars for recognizing an input, each responsible for a specific context, such as a subtask category or geographic region. Together, the grammars cover the entire domain. Multiple recognitions can be run in parallel against the same input, with each recognition using one or more of the context-specific grammars. The intermediate results from the different recognizer-grammar pairs are reconciled in one of two ways: by running re-recognition with a grammar composed dynamically from those results and potentially other domain knowledge, or by selecting the winner with a statistical classifier that operates on classification features extracted from those results and other domain knowledge.
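The parallel-recognition flow can be sketched with toy components. Here each "grammar" is a dictionary of phrases with made-up prior weights, the "recognizer" scores phrases by word overlap, and the winner is simply the highest-scoring result. That last step is a stand-in for the statistical classifier or re-recognition pass, not an implementation of either. All names and data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical context-specific grammars: phrase -> prior weight.
GRAMMARS = {
    "restaurants": {"joe's pizza": 0.9, "joe's plaza": 0.4},
    "hotels": {"joe's plaza": 0.7},
}


def recognize(context, grammar, utterance):
    """Toy recognizer: score each phrase by prior * fraction of utterance words matched."""
    words = set(utterance.split())
    denom = max(len(words), 1)
    scored = [
        (phrase, prior * len(words & set(phrase.split())) / denom)
        for phrase, prior in grammar.items()
    ]
    phrase, score = max(scored, key=lambda item: item[1])
    return {"context": context, "text": phrase, "score": score}


def recognize_all(utterance):
    """Run one recognition per context-specific grammar in parallel on the same input."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(recognize, ctx, grammar, utterance)
            for ctx, grammar in GRAMMARS.items()
        ]
        return [f.result() for f in futures]


def pick_winner(results):
    """Simplified reconciliation: keep the intermediate result with the best score."""
    return max(results, key=lambda r: r["score"])


winner = pick_winner(recognize_all("joe's pizza"))
```

In a fuller version, `pick_winner` would be replaced by a classifier over features drawn from all intermediate results, or by a second recognition pass over a grammar built from them.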
Abstract:
Sequential speech recognition using two unequal automatic speech recognition (ASR) systems may be provided. The system may provide two sets of vocabulary data. A determination may be made as to whether entries in one set of vocabulary data are likely to be confused with entries in the other set. If confusion is likely, a decoy entry from one set of vocabulary data may be placed in the other set so that speech recognition processing is more efficient and accurate.
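The decoy step can be sketched as follows. The abstract does not say how likely confusion is determined. This example substitutes plain string similarity from Python's `difflib` for acoustic confusability, which is a loud simplification. The function name, vocabularies, and the 0.8 cutoff are hypothetical.

```python
from difflib import SequenceMatcher


def add_decoys(primary_vocab, secondary_vocab, threshold=0.8):
    """Copy secondary entries that look confusable with a primary entry into the primary set.

    Text similarity is used as a placeholder for a real confusability measure.
    Returns the augmented primary vocabulary and the set of decoys added.
    """
    decoys = set()
    for other in secondary_vocab:
        if other in primary_vocab:
            continue  # already present, so no decoy is needed
        for entry in primary_vocab:
            if SequenceMatcher(None, entry, other).ratio() >= threshold:
                decoys.add(other)
                break
    return set(primary_vocab) | decoys, decoys


primary = {"call john"}
secondary = {"call joan", "play music"}
augmented, decoys = add_decoys(primary, secondary)
```

If the first recognizer then matches a decoy such as "call joan", that match can serve as a signal that the utterance belongs to the other vocabulary rather than being forced onto "call john".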