Abstract:
A system and method are provided for receiving speech and/or non-speech communications of natural language questions and/or commands and executing the questions and/or commands. The invention provides a conversational human-machine interface that includes a conversational speech analyzer, a general cognitive model, an environmental model, and a personalized cognitive model to determine context and domain knowledge and to invoke prior information to interpret a spoken utterance or a received non-spoken message. The system and method create, store, and use extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech or non-speech communication and presenting the expected results for a particular question or command.
Abstract:
The system and method described herein may provide advertisements in an integrated voice navigation services environment. In particular, one or more advertisements may be identified based on affinities among a current location associated with a navigation device and shared knowledge and information used to interpret natural language utterances that relate to a navigation context, wherein the one or more advertisements may then be presented via a multi-modal output. As such, the shared knowledge and the information relating to the navigation context may provide the system and method with dynamic awareness relating to context, available information sources, domain knowledge, and user behavior and preferences, among other things, which may be used to deliver targeted and contextually relevant advertisements in the integrated navigation services environment.
Abstract:
The methods and systems described herein may asynchronously process natural language utterances to provide real-time response performance and natural interaction with users. In particular, the methods and systems described herein may use various natural language speech recognition and interpretation components to identify a request (e.g., a query or command) in an utterance. The request identified in the utterance may then be processed with one or more domain agents, which may submit duplicate queries to multiple different data sources to process the request. The domain agents may then asynchronously evaluate responses to the duplicate queries to return results to users in a timely and natural manner, and further to account for the fact that the different data sources may respond to the queries at different speeds, provide unsatisfactory responses to the queries, or fail to respond to the queries at all.
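As an illustration only, the duplicate-query behavior described above might be sketched in Python with `asyncio`. The abstract does not specify an implementation; the function names `query_source` and `first_satisfactory`, the simulated delays, and the rule that any non-empty result counts as "satisfactory" are all assumptions made for this sketch.

```python
import asyncio


async def query_source(name, delay, result):
    """Hypothetical data source that answers after `delay` seconds.

    Passing an Exception as `result` simulates a source that fails.
    """
    await asyncio.sleep(delay)
    if isinstance(result, Exception):
        raise result
    return name, result


async def first_satisfactory(queries, timeout):
    """Send the same request to every source and return the first usable answer.

    Assumption: a response is "satisfactory" when it is non-empty.
    Returns None if no source gives a usable answer before `timeout`.
    """
    tasks = [asyncio.ensure_future(q) for q in queries]
    try:
        for next_done in asyncio.as_completed(tasks, timeout=timeout):
            try:
                name, result = await next_done
            except asyncio.TimeoutError:
                break  # the remaining sources did not respond in time
            except Exception:
                continue  # this source failed; keep waiting for the others
            if result:
                return name, result
        return None
    finally:
        for task in tasks:
            task.cancel()  # stop waiting on slower sources


async def demo():
    queries = [
        query_source("slow", 0.2, ["result from slow"]),
        query_source("broken", 0.01, RuntimeError("source down")),
        query_source("empty", 0.02, []),
        query_source("fast", 0.05, ["result from fast"]),
    ]
    return await first_satisfactory(queries, timeout=1.0)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the failing and empty sources are skipped, and the slow source is cancelled once a usable answer arrives, so the caller is never blocked by the slowest source.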
Abstract:
The systems and methods described herein may filter and eliminate noise from natural language utterances to improve accuracy associated with speech recognition and parsing capabilities. In particular, the systems and methods described herein may use a microphone array to provide directional signal capture, noise elimination, and cross-talk reduction associated with an input speech signal. Furthermore, a filter arranged between the microphone array and a speech coder may use band shaping, notch filtering, and adaptive echo cancellation to optimize a signal-to-noise ratio associated with the speech signal. The speech signal may then be sent to the speech coder, which may use adaptive lossy audio compression to optimize bandwidth requirements associated with transmitting the speech signal to a main unit that provides the speech recognition, parsing, and other natural language processing capabilities.
Abstract:
The systems and methods described herein may recognize natural language utterances that include queries and/or commands and execute the queries and/or commands based on user-specific profiles. The systems and methods described herein may include a complete speech-based information query, retrieval, presentation and command environment that makes significant use of context, prior information, domain knowledge, and the user-specific profiles to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created and tailored to specific users. For example, the systems and methods described herein may create, store, and use extensive personal profile information for different users, thereby improving the reliability of determining the context and presenting the results that the specific users may expect for a particular question or command.
Abstract:
Systems and methods are provided for receiving natural language queries and/or commands and executing the queries and/or commands. The systems and methods overcome the deficiencies of prior art speech query and response systems through the application of a complete speech-based information query, retrieval, presentation and command environment. This environment makes significant use of context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The systems and methods create, store, and use extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command.
Abstract:
A system and method for an integrated, multi-modal, multi-device natural language voice services environment may be provided. In particular, the environment may include a plurality of voice-enabled devices each having intent determination capabilities for processing multi-modal natural language inputs in addition to knowledge of the intent determination capabilities of other devices in the environment. Further, the environment may be arranged in a centralized manner, a distributed peer-to-peer manner, or various combinations thereof. As such, the various devices may cooperate to determine intent of multi-modal natural language inputs, and commands, queries, or other requests may be routed to one or more of the devices best suited to take action in response thereto.
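A minimal sketch of the routing idea described above, assuming each device holds a table of every device's capabilities per request domain. The device names, domains, scores, and the `route` function are hypothetical and are not taken from the abstract.

```python
# Hypothetical capability table that every device in the environment is
# assumed to share: device name -> {request domain: suitability score in [0, 1]}.
CAPABILITIES = {
    "phone": {"call": 0.9, "navigation": 0.4},
    "car_nav": {"navigation": 0.95, "music": 0.3},
    "media_player": {"music": 0.9},
}


def route(domain, capabilities=CAPABILITIES):
    """Return the device best suited to handle `domain`, or None if none can."""
    scored = [
        (caps[domain], device)
        for device, caps in capabilities.items()
        if domain in caps
    ]
    if not scored:
        return None
    # Highest score wins; ties are broken by device name.
    return max(scored)[1]
```

Here `route("navigation")` selects `car_nav` even if the request was spoken to the phone, and a domain that no device supports yields `None`. In a peer-to-peer arrangement each device could run the same lookup locally; in a centralized arrangement a single hub could run it.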
Abstract:
A system and method are provided for receiving speech and/or non-speech communications of natural language questions and/or commands and executing the questions and/or commands. The invention provides a conversational human-machine interface that includes a conversational speech analyzer, a general cognitive model, an environmental model, and a personalized cognitive model to determine context and domain knowledge and to invoke prior information to interpret a spoken utterance or a received non-spoken message. The system and method create, store, and use extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech or non-speech communication and presenting the expected results for a particular question or command.
Abstract:
A cooperative conversational voice user interface is provided. The cooperative conversational voice user interface may build upon short-term and long-term shared knowledge to generate one or more explicit and/or implicit hypotheses about an intent of a user utterance. The hypotheses may be ranked based on varying degrees of certainty, and an adaptive response may be generated for the user. Responses may be worded based on the degrees of certainty and may frame an appropriate domain for a subsequent utterance. In one implementation, misrecognitions may be tolerated, and the conversational course may be corrected based on subsequent utterances and/or responses.
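The ranking and certainty-dependent wording described above could look roughly like the following sketch. The thresholds, the response templates, and the `adaptive_response` function are assumptions chosen for illustration; the abstract does not define them.

```python
def adaptive_response(hypotheses, high=0.8, low=0.5):
    """Rank intent hypotheses and word the reply according to certainty.

    `hypotheses` is a list of (intent, certainty) pairs with certainty in
    [0, 1]. The `high` and `low` thresholds are illustrative assumptions.
    """
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    if not ranked:
        return "Sorry, what would you like to do?"
    intent, certainty = ranked[0]
    if certainty >= high:
        # High certainty: act without asking for confirmation.
        return f"OK, I will {intent}."
    if certainty >= low:
        # Medium certainty: confirm, which also frames the domain
        # for the user's next utterance.
        return f"Did you want me to {intent}?"
    # Low certainty: offer the top candidates so the user can correct course.
    options = " or ".join(h[0] for h in ranked[:2])
    return f"I am not sure I understood. Did you mean {options}?"
```

For example, `adaptive_response([("play jazz", 0.9), ("play chess", 0.3)])` acts directly, while a top certainty of 0.6 produces a confirmation question instead.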
Abstract:
An enhanced system for speech interpretation is provided. The system may receive a user verbalization and generate one or more preliminary interpretations of the verbalization by identifying one or more phonemes in the verbalization. An acoustic grammar may be used to map the phonemes to syllables or words, and the acoustic grammar may include one or more linking elements to reduce a search space associated with the grammar. The preliminary interpretations may be subject to various post-processing techniques to sharpen the accuracy of the preliminary interpretations. A heuristic model may assign weights to various parameters based on a context, a user profile, or other domain knowledge. A probable interpretation may be identified based on a confidence score for each of a set of candidate interpretations generated by the heuristic model. The model may be augmented or updated based on various information associated with the interpretation of the verbalization.
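One way to picture the weighted confidence scoring described above is the sketch below. The parameter names (`acoustic`, `context_match`, `profile_match`), the weight values, and the weighted-average formula are assumptions for illustration; the abstract does not say how the heuristic model combines its parameters.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate interpretation with hypothetical per-parameter scores in [0, 1]."""
    text: str
    acoustic: float       # how well the phonemes match the words
    context_match: float  # fit with the current conversational context
    profile_match: float  # fit with the user's profile


# Assumed weights; a real model might adjust these per context or per user.
WEIGHTS = {"acoustic": 0.5, "context_match": 0.3, "profile_match": 0.2}


def confidence(candidate, weights=WEIGHTS):
    """Weighted average of the candidate's parameter scores."""
    total = sum(weights.values())
    return sum(w * getattr(candidate, name) for name, w in weights.items()) / total


def most_probable(candidates, weights=WEIGHTS):
    """Pick the candidate interpretation with the highest confidence score."""
    return max(candidates, key=lambda c: confidence(c, weights))


candidates = [
    Candidate("wreck a nice beach", acoustic=0.7, context_match=0.2, profile_match=0.1),
    Candidate("recognize speech", acoustic=0.6, context_match=0.9, profile_match=0.8),
]
best = most_probable(candidates)
```

With these assumed numbers, the second candidate wins despite its lower acoustic score, because context and profile outweigh the small acoustic gap. Updating the model, as the abstract mentions, would amount to changing `WEIGHTS` over time.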