Underspecification of intents in a natural language processing system

    Publication number: US10216832B2

    Publication date: 2019-02-26

    Application number: US15384275

    Application date: 2016-12-19

    Abstract: A natural language processing system has a hierarchy of user intents related to a domain of interest, the hierarchy having specific intents corresponding to leaf nodes of the hierarchy, and more general intents corresponding to ancestor nodes of the leaf nodes. The system also has a trained understanding model that can classify natural language utterances according to user intent. When the understanding model cannot determine with sufficient confidence that a natural language utterance corresponds to one of the specific intents, the natural language processing system traverses the hierarchy of intents to find a more general user intent that is related to the most applicable specific intent of the utterance and for which there is sufficient confidence. The general intent can then be used to prompt the user with questions applicable to the general intent to obtain the missing information needed for a specific intent.
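The back-off described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the intent names, the `PARENT` table, the 0.7 threshold, and the rule that a general intent's confidence is the sum of the scores of the leaves beneath it are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical intent hierarchy, stored as child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "travel": None,
    "book_flight": "travel",
    "book_hotel": "travel",
    "book_flight_one_way": "book_flight",
    "book_flight_round_trip": "book_flight",
}


def ancestors(intent: str) -> List[str]:
    """Return the ancestors of an intent, nearest first."""
    chain = []
    parent = PARENT[intent]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def resolve_intent(leaf_scores: Dict[str, float], threshold: float = 0.7) -> Optional[str]:
    """Pick the most specific intent whose confidence reaches the threshold.

    Assumption: a general intent's confidence is the sum of the classifier
    scores of the specific (leaf) intents beneath it.
    """
    totals = dict(leaf_scores)
    for leaf, score in leaf_scores.items():
        for node in ancestors(leaf):
            totals[node] = totals.get(node, 0.0) + score

    # Start at the most applicable specific intent and walk up the hierarchy.
    best_leaf = max(leaf_scores, key=leaf_scores.get)
    for node in [best_leaf] + ancestors(best_leaf):
        if totals[node] >= threshold:
            return node
    return None


scores = {
    "book_flight_one_way": 0.45,
    "book_flight_round_trip": 0.40,
    "book_hotel": 0.15,
}
print(resolve_intent(scores))  # book_flight
```

Here neither flight variant is confident enough on its own, so the sketch falls back to the general `book_flight` intent, at which point a system could ask the user whether the trip is one-way or round-trip.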

    Hierarchical speech recognition decoder

    Publication number: US10096317B2

    Publication date: 2018-10-09

    Application number: US15131833

    Application date: 2016-04-18

    Abstract: A speech interpretation module interprets the audio of user utterances as sequences of words. To do so, the speech interpretation module parameterizes a literal corpus of expressions by identifying portions of the expressions that correspond to known concepts, and generates a parameterized statistical model from the resulting parameterized corpus. When speech is received, the speech interpretation module uses a hierarchical speech recognition decoder that uses both the parameterized statistical model and language sub-models that specify how to recognize a sequence of words. The separation of the language sub-models from the statistical model beneficially reduces the size of the literal corpus needed for training, reduces the size of the resulting model, provides more fine-grained interpretation of concepts, and improves computational efficiency by allowing run-time incorporation of the language sub-models.
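The corpus parameterization step can be illustrated with a small sketch. It covers only the text side (replacing known concept phrases with placeholders and counting the resulting expressions), not the speech decoder itself. The `SUB_MODELS` table, the placeholder names, and the use of plain counts as the "statistical model" are assumptions for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical language sub-models: each placeholder lists the phrases it covers.
SUB_MODELS: Dict[str, List[str]] = {
    "<CITY>": ["new york", "boston", "san francisco"],
    "<DAY>": ["monday", "tuesday", "friday"],
}


def parameterize(expression: str) -> str:
    """Replace every known concept phrase with its placeholder."""
    text = expression.lower()
    for slot, phrases in SUB_MODELS.items():
        # Try longer phrases first so multi-word concepts are matched whole.
        for phrase in sorted(phrases, key=len, reverse=True):
            text = re.sub(rf"\b{re.escape(phrase)}\b", slot, text)
    return text


def build_parameterized_counts(corpus: Iterable[str]) -> Counter:
    """Count parameterized expressions; a stand-in for a statistical model."""
    return Counter(parameterize(expression) for expression in corpus)


corpus = [
    "Fly to Boston on Monday",
    "Fly to New York on Friday",
    "Fly to San Francisco on Tuesday",
]
print(build_parameterized_counts(corpus))
# Counter({'fly to <CITY> on <DAY>': 3})
```

Three literal expressions collapse into one parameterized pattern, which suggests why separating the sub-models from the statistical model can shrink both the corpus needed and the resulting model.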

    System and method for pronunciation modeling
    Type: Granted invention patent (status: in force)

    Publication number: US09431011B2

    Publication date: 2016-08-30

    Application number: US14488844

    Application date: 2014-09-17

    CPC classification number: G10L15/187 G10L15/183 G10L2015/025

    Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.
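As a rough sketch of the substitution step, the code below expands a generic pronunciation into variants by replacing each phoneme with its family of interchangeable alternatives. The lexicon entry, the phoneme symbols, and the contents of `FAMILIES` are hypothetical; an alternative is stored as a tuple so that it can be a string of several phonemes, as the abstract allows.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical generic lexicon: word -> phoneme sequence.
GENERIC_LEXICON: Dict[str, List[str]] = {
    "water": ["W", "AO", "T", "ER"],
}

# Hypothetical families: every alternative in a family is labeled with the
# same phoneme (the dictionary key). An alternative may span several phonemes.
FAMILIES: Dict[str, List[Tuple[str, ...]]] = {
    "T": [("T",), ("D",)],
    "ER": [("ER",), ("AH", "R")],
}


def expand_pronunciations(word: str) -> List[List[str]]:
    """Substitute each phoneme's family and enumerate the resulting variants."""
    slots = [FAMILIES.get(p, [(p,)]) for p in GENERIC_LEXICON[word]]
    return [
        [phoneme for alternative in combo for phoneme in alternative]
        for combo in product(*slots)
    ]


for variant in expand_pronunciations("water"):
    print(" ".join(variant))
# W AO T ER
# W AO T AH R
# W AO D ER
# W AO D AH R
```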


    System and method for enhancing voice-enabled search based on automated demographic identification
    Type: Granted invention patent (status: in force)

    Publication number: US09189483B2

    Publication date: 2015-11-17

    Application number: US13847173

    Application date: 2013-03-19

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information.
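The last sentence of this abstract, about combining inferred metadata with self-reported demographics or letting it override them, can be sketched as a simple merge. The field names and the `override` flag are assumptions for illustration. The speech recognition and question-answering components are out of scope here.

```python
from typing import Dict


def merge_demographics(
    self_reported: Dict[str, str],
    inferred: Dict[str, str],
    override: bool = True,
) -> Dict[str, str]:
    """Combine voice-inferred metadata with self-reported demographics.

    With override=True, inferred values replace conflicting self-reported
    ones; with override=False, inferred values only fill in missing fields.
    """
    merged = dict(self_reported)
    for key, value in inferred.items():
        if override or key not in merged:
            merged[key] = value
    return merged


profile = {"age": "25-34", "region": "US-West"}
from_voice = {"age": "35-44", "gender": "female"}

print(merge_demographics(profile, from_voice))
# {'age': '35-44', 'region': 'US-West', 'gender': 'female'}
print(merge_demographics(profile, from_voice, override=False))
# {'age': '25-34', 'region': 'US-West', 'gender': 'female'}
```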


    Method of active learning for automatic speech recognition
    Type: Granted invention patent (status: in force)

    Publication number: US08990084B2

    Publication date: 2015-03-24

    Application number: US14176439

    Application date: 2014-02-10

    CPC classification number: G10L15/063

    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
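The selection step can be sketched as below. The abstract does not say how word confidences are combined or which utterances are sampled, so two assumptions are made: the utterance score is the mean of its word scores, and the lowest-scoring utterances are treated as the most informative. The utterance IDs and scores are invented; in practice the word confidences would come from the recognizer's lattice output.

```python
from statistics import mean
from typing import Dict, List


def utterance_confidence(word_confidences: List[float]) -> float:
    """Combine word confidences into one utterance score (assumed: the mean)."""
    return mean(word_confidences)


def select_for_transcription(pool: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k utterance IDs with the lowest confidence scores."""
    ranked = sorted(pool, key=lambda utt_id: utterance_confidence(pool[utt_id]))
    return ranked[:k]


# Hypothetical word confidence scores for three untranscribed utterances.
pool = {
    "utt1": [0.90, 0.95, 0.88],
    "utt2": [0.40, 0.60, 0.50],
    "utt3": [0.70, 0.30, 0.80],
}
print(select_for_transcription(pool, k=2))  # ['utt2', 'utt3']
```

In an iterative loop, the selected utterances would be transcribed by a human, added to the training set, and the recognizer retrained before the next round of selection.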


    Dialog management using knowledge graph-driven information state in a natural language processing system

    Publication number: US11288457B1

    Publication date: 2022-03-29

    Application number: US16265668

    Application date: 2019-02-01

    Abstract: Systems and methods are disclosed for determining a move driven by an interaction. In some embodiments, a processor determines an operational state of an interaction with a user based on parameter values of a data structure. The processor identifies a plurality of candidate moves for changing the operational state by determining a domain in which the interaction is occurring, retrieving a set of candidate moves that correspond to the domain from a knowledge graph, and adding the set to the plurality of candidate moves. The processor encodes input of the user received during the interaction into encoded terms, and determines a move for changing the operational state based on a match of the encoded terms to the set of candidate moves. The processor updates the parameter values of the data structure based on the move to reflect a current operational state led to by the move.
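A minimal sketch of the move-selection loop follows. The knowledge graph is reduced to a dictionary, "encoding" to lowercased word tokens, and matching to term overlap. All of these, along with the domain, move names, and state fields, are assumptions for illustration rather than the disclosed implementation.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical knowledge graph: domain -> candidate move -> trigger terms.
KNOWLEDGE_GRAPH: Dict[str, Dict[str, Set[str]]] = {
    "banking": {
        "check_balance": {"balance", "much"},
        "transfer_funds": {"transfer", "send"},
    },
}


def encode(user_input: str) -> Set[str]:
    """Encode user input into terms (assumed: lowercased word tokens)."""
    return set(re.findall(r"[a-z']+", user_input.lower()))


def choose_move(domain: str, user_input: str) -> Optional[str]:
    """Pick the candidate move for the domain that best matches the input."""
    candidates = KNOWLEDGE_GRAPH.get(domain, {})
    terms = encode(user_input)
    best = max(candidates, key=lambda move: len(candidates[move] & terms), default=None)
    if best is None or not candidates[best] & terms:
        return None  # no candidate move shares a term with the input
    return best


def update_state(state: Dict[str, object], move: str) -> Dict[str, object]:
    """Return a new state whose parameter values reflect the chosen move."""
    new_state = dict(state)
    new_state["last_move"] = move
    new_state["turns"] = int(state.get("turns", 0)) + 1
    return new_state


state = {"domain": "banking", "turns": 0}
move = choose_move("banking", "Please transfer 50 dollars")
print(move)  # transfer_funds
print(update_state(state, move))
# {'domain': 'banking', 'turns': 1, 'last_move': 'transfer_funds'}
```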
