Abstract:
Systems and methods for providing a user adaptive natural language interface are disclosed. The disclosed embodiments may receive and analyze user input to derive current user behavior data, including data indicative of characteristics of the user input. The user input is classified based on prior user behavior data previously logged during one or more previous user-system interactions and the current user behavior data to generate a classification of the user input. Machine learning algorithms can be employed to classify the user input. User adaptive utterances are selected based on the user input and the classification of the user input. The user-system interaction is logged for use as prior user behavior data in future user-system interactions. A response to the user input is generated, including synthesizing output speech from the user adaptive utterances selected. Example applications of the disclosed systems and methods provide user adaptive navigation directions in navigation systems.
Abstract:
The present disclosure describes dynamically adjusting linguistic models for automatic speech recognition based on biometric information to produce a more reliable speech recognition experience. Embodiments include receiving a speech signal, receiving a biometric signal from a biometric sensor implemented at least partially in hardware, determining a linguistic model based on the biometric signal, and processing the speech signal for speech recognition using the linguistic model based on the biometric signal.
Abstract:
Systems and methods are disclosed for providing non-lexical cues in synthesized speech. Original text is analyzed to determine characteristics of the text and/or to derive or augment an intent (e.g., an intent code). Non-lexical cue insertion points are determined based on the characteristics of the text and/or the intent. One or more non-lexical cues are inserted at insertion points to generate augmented text. The augmented text is synthesized into speech, including converting the non-lexical cues to speech output.