System and method for advanced turn-taking for interactive spoken dialog systems
    1.
    Granted invention patent
    Status: in force

    Publication number: US09378738B2

    Publication date: 2016-06-28

    Application number: US14565516

    Filing date: 2014-12-10

    CPC classification number: G10L15/222 G10L15/04 G10L15/05 G10L15/063 G10L15/083

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system. A system configured according to this disclosure can incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions. A first condition which, if found, allows the system to communicate partial speech recognition results, is that the most recent word found in the partial results is statistically likely to be the termination of the utterance, also known as a terminal node. A second condition is the determination that all search paths within a speech lattice converge to a common node, also known as a pinch node, before branching out again. Upon finding either condition, the system can communicate the partial speech recognition results. Stability and correctness probabilities can also determine which partial results are communicated.
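
    The two release conditions in this abstract can be illustrated with a small sketch. It is not the patented implementation: the lattice is assumed to be a plain adjacency dict, the end-of-utterance statistics are assumed to be a word-to-probability table, and the names `pinch_nodes`, `should_emit`, and the 0.8 threshold are hypothetical. A pinch node is approximated here as any interior node whose removal cuts every path from the start node to the current final nodes.

```python
from collections import deque


def reachable(lattice, start, finals, blocked):
    """Return True if some final node can be reached from start while avoiding `blocked`."""
    if start == blocked:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in finals:
            return True
        for nxt in lattice.get(node, []):
            if nxt != blocked and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def pinch_nodes(lattice, start, finals):
    """Interior nodes that every start-to-final path must pass through (assumed definition)."""
    nodes = set(lattice) | {n for succ in lattice.values() for n in succ}
    return {
        n for n in nodes
        if n != start and n not in finals and not reachable(lattice, start, finals, n)
    }


def should_emit(last_word, end_probability, lattice, start, finals, threshold=0.8):
    """Emit partial results if the last word is a likely utterance end or a pinch node exists."""
    likely_end = end_probability.get(last_word, 0.0) >= threshold  # hypothetical threshold
    return likely_end or bool(pinch_nodes(lattice, start, finals))


# Toy lattice: two competing paths converge at "c" and then branch out again.
converging = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d", "e"], "d": ["f"], "e": ["g"]}
# Toy lattice with no common node between the two paths.
diverging = {"s": ["a", "b"], "a": ["f"], "b": ["g"]}
```

    In this sketch, `pinch_nodes(converging, "s", {"f", "g"})` finds only `"c"`, while the `diverging` lattice yields no pinch node, so release then depends on the terminal-word probability alone.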

    Incremental speech recognition for dialog systems
    2.
    Granted invention patent
    Status: in force

    Publication number: US09015048B2

    Publication date: 2015-04-21

    Application number: US13691005

    Filing date: 2012-11-30

    CPC classification number: G10L15/1822

    Abstract: A system and method for integrating incremental speech recognition in dialog systems. An example system configured to practice the method receives incremental speech recognition results of user speech as part of a dialog with a user, and copies a dialog manager operating on the user speech to generate temporary instances of the dialog manager. Then the system evaluates actions the temporary instances of the dialog manager would take based on the incremental speech recognition results, and identifies an action that would advance the dialog and a corresponding temporary instance of the dialog manager. The system can then execute the action in the dialog and optionally replace the dialog manager with the corresponding temporary instance of the dialog manager. The action can include making a turn-taking decision in the dialog, such as whether, what, and when to speak or whether to be silent.
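
    The copy-evaluate-replace loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: `DialogManager`, its `update` method, the city list, and the action names `"confirm_city"` and `"wait"` are all hypothetical, and "an action that would advance the dialog" is assumed to mean any action other than `"wait"`.

```python
import copy


class DialogManager:
    """Toy dialog manager (hypothetical): fills a 'city' slot from recognized text."""

    KNOWN_CITIES = {"boston", "denver"}

    def __init__(self):
        self.slots = {}

    def update(self, partial_text):
        """Consume a recognition result and return the action the manager would take."""
        for word in partial_text.lower().split():
            if word in self.KNOWN_CITIES:
                self.slots["city"] = word
                return "confirm_city"
        return "wait"


def evaluate_partials(manager, partial_results):
    """Try each incremental result on a temporary copy of the dialog manager.

    Returns the first action that advances the dialog together with the temporary
    instance that produced it; the caller may adopt that instance as the new manager.
    """
    for partial in partial_results:
        trial = copy.deepcopy(manager)  # temporary instance; the original is untouched
        action = trial.update(partial)
        if action != "wait":
            return action, trial
    return "wait", manager


manager = DialogManager()
action, candidate = evaluate_partials(manager, ["i want", "i want to fly to boston"])
```

    Because each partial result is applied to a deep copy, `manager.slots` stays empty until the caller chooses to replace `manager` with `candidate`.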

    Targeted clarification questions in speech recognition with concept presence score and concept correctness score

    Publication number: US09953644B2

    Publication date: 2018-04-24

    Application number: US14557030

    Filing date: 2014-12-01

    CPC classification number: G10L15/22 G10L15/01 G10L15/1822 H04M2250/74

    Abstract: A system, method and computer-readable storage devices are disclosed for using targeted clarification (TC) questions in dialog systems in a multimodal virtual agent system (MVA) providing access to information about movies, restaurants, and musical events. In contrast with open-domain spoken systems, the MVA application covers a domain with a fixed set of concepts and uses a natural language understanding (NLU) component to mark concepts in automatically recognized speech. Instead of identifying an error segment, localized error detection (LED) identifies which of the concepts are likely to be present and correct using domain knowledge, automatic speech recognition (ASR), and NLU tags and scores. If at least one concept is identified to be present but not correct, the TC component uses this information to generate a targeted clarification question. This approach computes probability distributions of concept presence and correctness for each user utterance, which can apply to automatic learning for clarification policies.
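
    The "present but not correct" test in this abstract can be illustrated with a short sketch. It assumes each concept already carries a presence probability and a correctness probability; the function names, the 0.5 cutoffs, the concept labels, and the question wording are hypothetical and do not come from the patent.

```python
def clarification_targets(scores, presence_min=0.5, correctness_min=0.5):
    """Return concepts judged present but not correct.

    `scores` maps a concept name to a (presence, correctness) probability pair.
    Both cutoffs are illustrative assumptions.
    """
    return [
        concept
        for concept, (presence, correctness) in scores.items()
        if presence >= presence_min and correctness < correctness_min
    ]


def clarification_question(scores):
    """Build a targeted question, or return None when nothing needs clarifying."""
    targets = clarification_targets(scores)
    if not targets:
        return None
    return "Sorry, which " + " and ".join(targets) + " did you mean?"


# Hypothetical scores for one utterance in a restaurant-search domain.
scores = {
    "cuisine": (0.9, 0.2),    # probably mentioned, probably misrecognized
    "location": (0.95, 0.9),  # mentioned and recognized correctly
    "time": (0.1, 0.1),       # probably not mentioned at all
}
question = clarification_question(scores)
```

    Here only `cuisine` is both likely present and likely wrong, so the question targets that one concept instead of asking the user to repeat the whole utterance.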

    System and method for multi-agent architecture for interactive machines
    6.
    Granted invention patent
    Status: in force

    Publication number: US09530412B2

    Publication date: 2016-12-27

    Application number: US14473288

    Filing date: 2014-08-29

    Inventor: Ethan Selfridge

    CPC classification number: G10L15/22 G10L15/222 G10L2015/227

    Abstract: Systems, methods, and computer-readable storage devices are disclosed for an event-driven multi-agent architecture that improves via a semi-hierarchical multi-agent reinforcement learning approach. A system receives a user input during a speech dialog between a user and the system. The system then processes the user input, identifying an importance of the user input to the speech dialog based on a user classification and identifying a variable strength turn-taking signal inferred from the user input. An utterance selection agent selects an utterance for replying to the user input based on the importance of the user input, and a turn-taking agent determines whether to output the utterance based on the utterance and the variable strength turn-taking signal. When the turn-taking agent indicates the utterance should be output, the system selects when to output the utterance.
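
    The division of labor between the two agents can be sketched as below. The abstract describes agents improved through reinforcement learning; this sketch swaps in fixed hand-written rules purely to show the data flow, so every name, field, and threshold in it is a hypothetical stand-in and none of it is the learned policy the patent refers to.

```python
from dataclasses import dataclass


@dataclass
class UserInput:
    text: str
    importance: float   # assumed 0..1 score derived from a user classification
    turn_signal: float  # assumed 0..1 strength of the cue that the user is yielding the turn


def select_utterance(user_input):
    """Utterance selection agent (rule-based stand-in): reply according to importance."""
    if user_input.importance >= 0.5:
        return "Let me confirm: " + user_input.text
    return "Okay."


def should_output(utterance, user_input, threshold=0.6):
    """Turn-taking agent (rule-based stand-in).

    Uses both the candidate utterance and the turn-taking signal: longer
    utterances require a stronger signal before the system takes the turn.
    """
    required = threshold + (0.1 if len(utterance.split()) > 5 else 0.0)
    return user_input.turn_signal >= required


weak = UserInput("book a table for two at eight", importance=0.9, turn_signal=0.65)
strong = UserInput("book a table for two at eight", importance=0.9, turn_signal=0.8)
```

    With these toy rules the long confirmation is held back for `weak` and released for `strong`, since only the second input carries a turn-taking signal above the raised bar.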
