System and method of supporting adaptive misrecognition in conversational speech
    1.
    Granted invention patent
    Status: in force

    Publication No.: US08620659B2

    Publication date: 2013-12-31

    Application No.: US13022370

    Filing date: 2011-02-07

    IPC classification: G10L15/18 G10L15/00 G10L21/00

    Abstract: A system and method are provided for receiving speech and/or non-speech communications of natural language questions and/or commands and executing the questions and/or commands. The invention provides a conversational human-machine interface that includes a conversational speech analyzer, a general cognitive model, an environmental model, and a personalized cognitive model to determine context, domain knowledge, and invoke prior information to interpret a spoken utterance or a received non-spoken message. The system and method creates, stores, and uses extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech or non-speech communication and presenting the expected results for a particular question or command.

    Method and system for asynchronously processing natural language utterances
    3.
    Granted invention patent
    Status: in force

    Publication No.: US08155962B2

    Publication date: 2012-04-10

    Application No.: US12838982

    Filing date: 2010-07-19

    Abstract: The methods and systems described herein may asynchronously process natural language utterances to provide real-time response performance and natural interaction with users. In particular, the methods and systems described herein may use various natural language speech recognition and interpretation components to identify a request (e.g., a query or command) in an utterance. The request identified in the utterance may then be processed with one or more domain agents, which may submit duplicate queries to multiple different data sources to process the request. The domain agents may then asynchronously evaluate responses to the duplicate queries to return results to users in a timely and natural manner, and further to account for the fact that the different data sources may respond to the queries at different speeds, provide unsatisfactory responses to the queries, or fail to respond to the queries at all.
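
    Illustrative sketch (not taken from the patent): one way a domain agent might fan a duplicate query out to several data sources and take the first satisfactory answer as responses arrive. The source functions, the convention that None means "unsatisfactory", and the timeout value are all assumptions made for this example.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

# Hypothetical data-source type: takes a query, returns an answer or None.
Source = Callable[[str], Awaitable[Optional[str]]]


async def fast_source(query: str) -> Optional[str]:
    await asyncio.sleep(0.01)
    return None  # answers quickly, but with nothing usable


async def failing_source(query: str) -> Optional[str]:
    await asyncio.sleep(0.02)
    raise ConnectionError("source unavailable")


async def slow_source(query: str) -> Optional[str]:
    await asyncio.sleep(0.05)
    return f"answer to {query!r}"


async def first_satisfactory(
    query: str, sources: List[Source], timeout: float = 1.0
) -> Optional[str]:
    """Send the same query to every source; return the first non-None answer."""
    tasks = [asyncio.create_task(src(query)) for src in sources]
    try:
        # as_completed yields awaitables in the order the sources finish.
        for finished in asyncio.as_completed(tasks, timeout=timeout):
            try:
                result = await finished
            except asyncio.TimeoutError:
                return None  # remaining sources did not answer in time
            except Exception:
                continue  # this source failed; keep waiting for the others
            if result is not None:
                return result
        return None  # every source answered, none satisfactorily
    finally:
        for task in tasks:
            task.cancel()  # no-op for finished tasks


if __name__ == "__main__":
    sources = [fast_source, failing_source, slow_source]
    print(asyncio.run(first_satisfactory("weather", sources)))
```

    In this sketch the slowest source wins because it is the only one that returns a usable answer; the quick empty reply and the failed source are skipped rather than blocking the response.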

    System and method for filtering and eliminating noise from natural language utterances to improve speech recognition and parsing
    4.
    Granted invention patent
    Status: in force

    Publication No.: US08140327B2

    Publication date: 2012-03-20

    Application No.: US12765753

    Filing date: 2010-04-22

    IPC classification: G10L15/20 G10L21/02

    Abstract: The systems and methods described herein may filter and eliminate noise from natural language utterances to improve accuracy associated with speech recognition and parsing capabilities. In particular, the systems and methods described herein may use a microphone array to provide directional signal capture, noise elimination, and cross-talk reduction associated with an input speech signal. Furthermore, a filter arranged between the microphone array and a speech coder may use band shaping, notch filtering, and adaptive echo cancellation to optimize a signal-to-noise ratio associated with the speech signal. The speech signal may then be sent to the speech coder, which may use adaptive lossy audio compression to optimize bandwidth requirements associated with transmitting the speech signal to a main unit that provides the speech recognition, parsing, and other natural language processing capabilities.
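
    Illustrative sketch (not taken from the patent): the abstract names notch filtering without giving a design, so the example below assumes a standard second-order (biquad) notch. The 60 Hz interference frequency, 8 kHz sample rate, and Q value are assumptions chosen only to demonstrate the idea.

```python
import math
from typing import Iterable, List


def notch_filter(
    samples: Iterable[float],
    notch_hz: float,
    sample_rate_hz: float,
    q: float = 5.0,
) -> List[float]:
    """Attenuate a narrow band around notch_hz with a biquad notch filter."""
    w0 = 2.0 * math.pi * notch_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)

    # Coefficients normalised by a0 = 1 + alpha.
    a0 = 1.0 + alpha
    b0, b1, b2 = 1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0
    a1, a2 = -2.0 * cos_w0 / a0, (1.0 - alpha) / a0

    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out: List[float] = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def rms(values: List[float]) -> float:
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    fs = 8000.0
    hum = [math.sin(2 * math.pi * 60 * n / fs) for n in range(8000)]
    tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(8000)]
    # Compare levels after the filter has settled (last 800 samples).
    print("60 Hz after notch:", rms(notch_filter(hum, 60, fs)[-800:]))
    print("1 kHz after notch:", rms(notch_filter(tone, 60, fs)[-800:]))
```

    Once the filter has settled, the 60 Hz component is suppressed while a 1 kHz tone passes almost unchanged, which is how a notch stage can improve the signal-to-noise ratio before the speech coder.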

    System and method for user-specific speech recognition
    5.
    Granted invention patent
    Status: in force

    Publication No.: US08112275B2

    Publication date: 2012-02-07

    Application No.: US12765733

    Filing date: 2010-04-22

    IPC classification: G10L15/06 G10L15/18 G10L15/22

    Abstract: The systems and methods described herein may recognize natural language utterances that include queries and/or commands and execute the queries and/or commands based on user-specific profiles. The systems and methods described herein may include a complete speech-based information query, retrieval, presentation and command environment that makes significant use of context, prior information, domain knowledge, and the user-specific profiles to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created and tailored to specific users. For example, the systems and methods described herein may create, store, and use extensive personal profile information for different users, thereby improving the reliability of determining the context and presenting the results that the specific users may expect for a particular question or command.

    System and method for an integrated, multi-modal, multi-device natural language voice services environment
    7.
    Granted invention patent
    Status: in force

    Publication No.: US08589161B2

    Publication date: 2013-11-19

    Application No.: US12127343

    Filing date: 2008-05-27

    IPC classification: G10L15/04 G10L15/00 G10L15/18

    Abstract: A system and method for an integrated, multi-modal, multi-device natural language voice services environment may be provided. In particular, the environment may include a plurality of voice-enabled devices each having intent determination capabilities for processing multi-modal natural language inputs in addition to knowledge of the intent determination capabilities of other devices in the environment. Further, the environment may be arranged in a centralized manner, a distributed peer-to-peer manner, or various combinations thereof. As such, the various devices may cooperate to determine intent of multi-modal natural language inputs, and commands, queries, or other requests may be routed to one or more of the devices best suited to take action in response thereto.
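
    Illustrative sketch (not taken from the patent): a minimal picture of routing a request to the best-suited device, assuming each device holds a table of every device's capability per domain. The device names, domains, and 0 to 1 scores are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical shared knowledge: device -> {domain -> capability score in [0, 1]}.
Capabilities = Dict[str, Dict[str, float]]


def best_device(domain: str, capabilities: Capabilities) -> Optional[str]:
    """Return the device with the highest score for the domain, if any can act."""
    scored = {dev: caps.get(domain, 0.0) for dev, caps in capabilities.items()}
    device = max(scored, key=scored.get, default=None)
    if device is None or scored[device] == 0.0:
        return None  # no device in the environment handles this domain
    return device


if __name__ == "__main__":
    table: Capabilities = {
        "phone": {"navigation": 0.6, "music": 0.4},
        "car": {"navigation": 0.9},
        "tv": {"video": 0.8},
    }
    print(best_device("navigation", table))
    print(best_device("email", table))
```

    The same lookup works whether the table lives on a central hub or is replicated on each peer; only where the table is stored changes.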

    System and method of supporting adaptive misrecognition conversational speech
    8.
    Granted invention patent
    Status: in force

    Publication No.: US08332224B2

    Publication date: 2012-12-11

    Application No.: US12571795

    Filing date: 2009-10-01

    IPC classification: G10L15/18 G10L15/00 G10L21/00

    Abstract: A system and method are provided for receiving speech and/or non-speech communications of natural language questions and/or commands and executing the questions and/or commands. The invention provides a conversational human-machine interface that includes a conversational speech analyzer, a general cognitive model, an environmental model, and a personalized cognitive model to determine context, domain knowledge, and invoke prior information to interpret a spoken utterance or a received non-spoken message. The system and method creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech or non-speech communication and presenting the expected results for a particular question or command.

    System and method for a cooperative conversational voice user interface
    9.
    Granted invention patent
    Status: in force

    Publication No.: US08073681B2

    Publication date: 2011-12-06

    Application No.: US11580926

    Filing date: 2006-10-16

    IPC classification: G06F17/27 G10L21/00 G10L11/00

    Abstract: A cooperative conversational voice user interface is provided. The cooperative conversational voice user interface may build upon short-term and long-term shared knowledge to generate one or more explicit and/or implicit hypotheses about an intent of a user utterance. The hypotheses may be ranked based on varying degrees of certainty, and an adaptive response may be generated for the user. Responses may be worded based on the degrees of certainty and to frame an appropriate domain for a subsequent utterance. In one implementation, misrecognitions may be tolerated, and conversational course may be corrected based on subsequent utterances and/or responses.
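
    Illustrative sketch (not taken from the patent): ranking intent hypotheses by certainty and wording the reply according to how certain the top hypothesis is. The Hypothesis class, the 0.8 and 0.4 thresholds, and the response phrasings are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    intent: str       # hypothesised user intent, phrased as an action
    certainty: float  # assumed to lie in [0, 1]


def rank(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    """Order hypotheses from most to least certain."""
    return sorted(hypotheses, key=lambda h: h.certainty, reverse=True)


def adaptive_response(
    hypotheses: List[Hypothesis], high: float = 0.8, low: float = 0.4
) -> str:
    """Word the reply according to the certainty of the best hypothesis."""
    if not hypotheses:
        return "Sorry, could you rephrase that?"
    best = rank(hypotheses)[0]
    if best.certainty >= high:
        return f"OK, {best.intent}."              # act without confirming
    if best.certainty >= low:
        return f"Did you want to {best.intent}?"  # confirm before acting
    return "Sorry, I didn't catch that. What would you like to do?"


if __name__ == "__main__":
    guesses = [Hypothesis("call Tom", 0.55), Hypothesis("play Tom Petty", 0.9)]
    print(adaptive_response(guesses))
    print(adaptive_response([Hypothesis("call Tom", 0.55)]))
```

    A confirming question at medium certainty gives the user a cheap way to correct a misrecognition in the next utterance instead of restarting the exchange.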

    Dynamic speech sharpening
    10.
    Granted invention patent
    Status: in force

    Publication No.: US08069046B2

    Publication date: 2011-11-29

    Application No.: US12608572

    Filing date: 2009-10-29

    IPC classification: G10L15/00

    Abstract: An enhanced system for speech interpretation is provided. The system may include receiving a user verbalization and generating one or more preliminary interpretations of the verbalization by identifying one or more phonemes in the verbalization. An acoustic grammar may be used to map the phonemes to syllables or words, and the acoustic grammar may include one or more linking elements to reduce a search space associated with the grammar. The preliminary interpretations may be subject to various post-processing techniques to sharpen accuracy of the preliminary interpretation. A heuristic model may assign weights to various parameters based on a context, a user profile, or other domain knowledge. A probable interpretation may be identified based on a confidence score for each of a set of candidate interpretations generated by the heuristic model. The model may be augmented or updated based on various information associated with the interpretation of the verbalization.
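
    Illustrative sketch (not taken from the patent): scoring candidate interpretations with context-dependent weights and picking the most probable one. The parameter names ("acoustic", "context"), the weighted-average formula, and all numeric values are assumptions for this example.

```python
from typing import Dict, List, Tuple

# A candidate is (interpretation text, {parameter name -> score in [0, 1]}).
Candidate = Tuple[str, Dict[str, float]]


def confidence(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of parameter scores; missing parameters count as 0."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(w * scores.get(name, 0.0) for name, w in weights.items()) / total


def most_probable(candidates: List[Candidate], weights: Dict[str, float]) -> str:
    """Return the interpretation with the highest confidence score."""
    best = max(candidates, key=lambda c: confidence(c[1], weights))
    return best[0]


if __name__ == "__main__":
    candidates: List[Candidate] = [
        ("call Tom", {"acoustic": 0.9, "context": 0.1}),
        ("play Tom Petty", {"acoustic": 0.6, "context": 0.9}),
    ]
    # Hypothetical weights: a music-playback context trusts context more.
    print(most_probable(candidates, {"acoustic": 1.0, "context": 3.0}))
    # With no strong context, acoustic evidence dominates.
    print(most_probable(candidates, {"acoustic": 3.0, "context": 1.0}))
```

    The same candidates yield different winners under the two weightings, which is the effect of letting context or a user profile set the weights.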
