SESSION CONTEXT MODELING FOR CONVERSATIONAL UNDERSTANDING SYSTEMS
    1.
    Invention Publication
    SESSION CONTEXT MODELING FOR CONVERSATIONAL UNDERSTANDING SYSTEMS (Granted)

    Publication No.: EP3158559A1

    Publication Date: 2017-04-26

    Application No.: EP15736702.0

    Filing Date: 2015-06-17

    Abstract: Systems and methods are provided for improving language models for speech recognition by adapting knowledge sources utilized by the language models to session contexts. A knowledge source, such as a knowledge graph, is used to capture and model dynamic session context based on user interaction information from usage history, such as session logs, that is mapped to the knowledge source. From sequences of user interactions, higher-level intent sequences may be determined and used to form models that anticipate similar intents but with different arguments, including arguments that do not necessarily appear in the usage history. In this way, the session context models may be used to determine likely next interactions or “turns” from a user, given a previous turn or turns. Language models corresponding to the likely next turns are then interpolated and provided to improve recognition accuracy of the next turn received from the user.

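The turn-prediction and interpolation step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the intents, vocabularies, and probability values are invented stand-ins, and real systems would use trained n-gram or neural language models rather than unigram tables.

```python
# Intent-transition model mined from session logs (illustrative values):
# given the previous turn's intent, how likely is each next intent?
NEXT_INTENT_PROB = {
    "find_movie": {"buy_ticket": 0.6, "find_showtime": 0.3, "find_movie": 0.1},
}

# Per-intent unigram language models (illustrative stand-ins for real LMs).
INTENT_LM = {
    "buy_ticket": {"buy": 0.4, "tickets": 0.4, "for": 0.2},
    "find_showtime": {"when": 0.5, "is": 0.3, "playing": 0.2},
    "find_movie": {"find": 0.5, "movies": 0.5},
}

def interpolated_prob(word: str, prev_intent: str) -> float:
    """P(word) = sum over likely next intents of P(intent | prev) * P(word | intent)."""
    weights = NEXT_INTENT_PROB[prev_intent]
    return sum(w * INTENT_LM[intent].get(word, 0.0) for intent, w in weights.items())

# After a "find_movie" turn, words from the likely "buy_ticket" follow-up
# receive most of the probability mass.
print(round(interpolated_prob("buy", "find_movie"), 3))  # 0.24
```

The interpolation weights come from the session-context model, so the same per-intent language models are reweighted dynamically as the conversation progresses.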

    KNOWLEDGE SOURCE PERSONALIZATION TO IMPROVE LANGUAGE MODELS
    2.
    Invention Publication
    KNOWLEDGE SOURCE PERSONALIZATION TO IMPROVE LANGUAGE MODELS (Pending, Published)

    Publication No.: EP3143522A1

    Publication Date: 2017-03-22

    Application No.: EP15728256.7

    Filing Date: 2015-05-15

    IPC Classification: G06F17/30 G10L15/06

    Abstract: Systems and methods are provided for improving language models for speech recognition by personalizing knowledge sources utilized by the language models to specific users or user-population characteristics. A knowledge source, such as a knowledge graph, is personalized for a particular user by mapping entities or user actions from usage history for the user, such as query logs, to the knowledge source. The personalized knowledge source may be used to build a personal language model by training a language model with queries corresponding to entities or entity pairs that appear in usage history. In some embodiments, a personalized knowledge source for a specific user can be extended based on personalized knowledge sources of similar users.

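The personalization step described above can be sketched as follows, under strong simplifying assumptions: the entities, query log, and unigram "training" are all illustrative, standing in for a real knowledge graph mapping and language-model training pipeline.

```python
from collections import Counter

# Entities mapped from this user's query logs onto the knowledge source
# (illustrative; a real system would resolve them against a knowledge graph).
user_entities = {"seattle", "space needle"}

# Candidate training queries drawn from a larger corpus.
corpus = [
    "directions to space needle",
    "weather in seattle",
    "population of paris",
]

def personal_training_set(queries, entities):
    """Keep only queries that mention an entity from the user's history."""
    return [q for q in queries if any(e in q for e in entities)]

def train_unigram_lm(queries):
    """Count words to form a simple unigram model (stand-in for real LM training)."""
    counts = Counter(w for q in queries for w in q.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

lm = train_unigram_lm(personal_training_set(corpus, user_entities))
# Queries about unrelated entities ("paris") contribute nothing to this
# user's personal language model.
```

Extending the model from similar users, as the abstract mentions, would amount to unioning `user_entities` with entities drawn from neighboring users' personalized knowledge sources before filtering.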

    SESSION CONTEXT MODELING FOR CONVERSATIONAL UNDERSTANDING SYSTEMS

    Publication No.: EP3158559B1

    Publication Date: 2018-05-23

    Application No.: EP15736702.0

    Filing Date: 2015-06-17

    Abstract: Identical to the corresponding A1 publication (EP3158559A1) above.

    MODEL BASED APPROACH FOR ON-SCREEN ITEM SELECTION AND DISAMBIGUATION
    5.
    Invention Publication
    MODEL BASED APPROACH FOR ON-SCREEN ITEM SELECTION AND DISAMBIGUATION (Pending, Published)

    Publication No.: EP3114582A1

    Publication Date: 2017-01-11

    Application No.: EP15716197.7

    Filing Date: 2015-02-27

    IPC Classification: G06F17/30 G06F3/16

    Abstract: A model-based approach for on-screen item selection and disambiguation is provided. An utterance may be received by a computing device in response to a display of a list of items for selection on a display screen. A disambiguation model may then be applied to the utterance. The disambiguation model may be utilized to determine whether the utterance is directed to at least one of the displayed items, to extract referential features from the utterance, and to identify the item in the list corresponding to the utterance based on the extracted referential features. The computing device may then perform an action which includes selecting the identified item associated with the utterance.

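The resolution step this abstract describes can be illustrated with a toy resolver. Note the hedge: the referential features here (ordinal words and lexical overlap) and the hand-written scoring are illustrative; the patent describes a learned disambiguation model, not these rules.

```python
# Ordinal referential features ("the second one" -> index 1).
ORDINALS = {"first": 0, "second": 1, "third": 2, "last": -1}

def resolve(utterance: str, items: list[str]):
    """Return the displayed item the utterance most likely refers to, or None."""
    words = utterance.lower().split()
    # Referential feature 1: an explicit ordinal reference.
    for w in words:
        if w in ORDINALS:
            return items[ORDINALS[w]]
    # Referential feature 2: lexical overlap with an item's title.
    overlap = [(len(set(words) & set(i.lower().split())), i) for i in items]
    best = max(overlap)
    # No overlap at all: the utterance is not directed at the list.
    return best[1] if best[0] > 0 else None

items = ["Star Trek", "Star Wars", "Gravity"]
print(resolve("play the second one", items))  # Star Wars
print(resolve("play gravity", items))         # Gravity
```

Returning `None` corresponds to the model's first job in the abstract: deciding whether the utterance is directed at the displayed list at all.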

    EYE GAZE FOR SPOKEN LANGUAGE UNDERSTANDING IN MULTI-MODAL CONVERSATIONAL INTERACTIONS
    6.
    Invention Publication
    EYE GAZE FOR SPOKEN LANGUAGE UNDERSTANDING IN MULTI-MODAL CONVERSATIONAL INTERACTIONS (Granted)

    Publication No.: EP3198328A1

    Publication Date: 2017-08-02

    Application No.: EP15778481.0

    Filing Date: 2015-09-25

    Abstract: Improving accuracy in understanding and/or resolving references to visual elements in a visual context associated with a computerized conversational system is described. Techniques described herein leverage gaze input with gestures and/or speech input to improve spoken language understanding in computerized conversational systems. Leveraging gaze input and speech input improves spoken language understanding in conversational systems by improving the accuracy with which the system can resolve references, or interpret a user's intent, with respect to visual elements in a visual context. In at least one example, the techniques herein describe tracking gaze to generate gaze input, recognizing speech input, and extracting gaze features and lexical features from the user input. Based at least in part on the gaze features and lexical features, user utterances directed to visual elements in a visual context can be resolved.

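The combination of gaze features and lexical features can be sketched as a simple weighted score over on-screen elements. Everything here is an illustrative assumption: the fixation-time feature, the overlap feature, and the equal weights stand in for the trained multi-modal model the abstract describes.

```python
def resolve_reference(elements, gaze_fixation_ms, utterance, w_gaze=0.5, w_lex=0.5):
    """Score each element by normalized gaze dwell time plus word overlap,
    and return the element the utterance most likely refers to."""
    words = set(utterance.lower().split())
    total_gaze = sum(gaze_fixation_ms.values()) or 1

    def score(el):
        # Gaze feature: fraction of recent fixation time spent on this element.
        gaze = gaze_fixation_ms.get(el, 0) / total_gaze
        # Lexical feature: fraction of the element's words mentioned in speech.
        lex = len(words & set(el.lower().split())) / max(len(el.split()), 1)
        return w_gaze * gaze + w_lex * lex

    return max(elements, key=score)

elements = ["red button", "blue slider"]
gaze = {"red button": 800, "blue slider": 200}
# "that button" is lexically ambiguous-free here, but even for vaguer
# utterances the gaze feature would break the tie toward "red button".
print(resolve_reference(elements, gaze, "click that button"))  # red button
```

The point of the design is that gaze disambiguates utterances like "click that one," where lexical features alone carry no signal.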

    BUILDING MULTI-LANGUAGE PROCESSES FROM EXISTING SINGLE-LANGUAGE PROCESSES
    7.
    Invention Publication
    BUILDING MULTI-LANGUAGE PROCESSES FROM EXISTING SINGLE-LANGUAGE PROCESSES (Pending, Published)

    Publication No.: EP2847689A2

    Publication Date: 2015-03-18

    Application No.: EP13724058.6

    Filing Date: 2013-05-01

    IPC Classification: G06F17/28 G10L15/26

    CPC Classification: G06F17/289

    Abstract: Processes capable of accepting linguistic input in one or more languages are generated by re-using existing linguistic components associated with a different anchor language, together with machine translation components that translate between the anchor language and the one or more languages. Linguistic input is directed to machine translation components that translate such input from its language into the anchor language. Those existing linguistic components are then utilized to initiate responsive processing and generate output. Optionally, the output is directed through the machine translation components. A language identifier can initially receive linguistic input and identify the language within which such linguistic input is provided, to select an appropriate machine translation component. A hybrid process, comprising machine translation components and linguistic components associated with the anchor language, can also serve as an initiating construct from which a single-language process is created over time.
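The anchor-language pipeline in this abstract (identify language, translate to anchor, run the existing components, translate the response back) can be shown schematically. The tiny phrase tables and echo-style NLU component are placeholders, not real machine translation or language identification.

```python
ANCHOR = "en"

# Toy phrase tables standing in for machine translation components.
MT = {
    ("es", "en"): {"hola": "hello"},
    ("en", "es"): {"hello": "hola"},
}

def identify_language(text: str) -> str:
    """Placeholder language identifier selecting the right MT component."""
    return "es" if text in MT[("es", "en")] else "en"

def translate(text: str, src: str, dst: str) -> str:
    """Translate via the phrase table; identity when src == dst."""
    return text if src == dst else MT[(src, dst)].get(text, text)

def anchor_nlu(text: str) -> str:
    """Existing anchor-language linguistic component (here: echoes a greeting)."""
    return "hello" if text == "hello" else text

def process(text: str) -> str:
    lang = identify_language(text)                    # language identifier
    anchored = translate(text, lang, ANCHOR)          # input -> anchor language
    response = anchor_nlu(anchored)                   # existing components
    return translate(response, ANCHOR, lang)          # output -> user's language

print(process("hola"))  # hola
```

Spanish input is handled entirely by the English components plus translation in and out, which is the abstract's point: no Spanish linguistic components need to exist for the process to accept Spanish input.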