SPEECH RECOGNITION FOR KEYWORDS
    11.
    Patent application
    SPEECH RECOGNITION FOR KEYWORDS (status: under examination, published)

    Publication No.: US20160335677A1

    Publication date: 2016-11-17

    Application No.: US14710928

    Application date: 2015-05-13

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition are disclosed. In one aspect, a method includes receiving a candidate adword from an advertiser. The method further includes generating a score for the candidate adword based on a likelihood of a speech recognizer generating, based on an utterance of the candidate adword, a transcription that includes a word that is associated with an expected pronunciation of the candidate adword. The method further includes classifying, based at least on the score, the candidate adword as an appropriate adword for use in a bidding process for advertisements that are selected based on a transcription of a speech query or as not an appropriate adword for use in the bidding process for advertisements that are selected based on the transcription of the speech query.
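
    Illustration (not from the patent): the scoring and classification steps in the abstract above could be sketched as follows. The sketch assumes the recognizer has already been run on several utterances of the candidate adword and that its transcriptions are available as plain strings. The names score_candidate and classify_candidate, the made-up adword "flurbo", and the 0.8 threshold are all hypothetical.

```python
from typing import Iterable, Set


def score_candidate(expected_words: Set[str], transcriptions: Iterable[str]) -> float:
    """Estimate how likely the recognizer is to output an expected word.

    The score is the fraction of transcriptions that contain at least one
    word associated with the expected pronunciation of the candidate.
    """
    total = 0
    hits = 0
    for text in transcriptions:
        total += 1
        tokens = set(text.lower().split())
        if tokens & expected_words:
            hits += 1
    return hits / total if total else 0.0


def classify_candidate(score: float, threshold: float = 0.8) -> bool:
    """Return True if the candidate looks appropriate for speech-query bidding."""
    return score >= threshold


# Hypothetical recognizer output for five utterances of the adword "flurbo".
sample_transcriptions = ["flurbo", "floor bow", "flurbo", "flurbo", "fur bow"]
sample_score = score_candidate({"flurbo"}, sample_transcriptions)
sample_is_appropriate = classify_candidate(sample_score)
```

    In this toy run the recognizer produces the expected word in three of five transcriptions, so the candidate falls below the assumed threshold and would be classified as not appropriate.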

    Language model biasing modulation
    12.
    Granted patent
    Language model biasing modulation (status: in force)

    Publication No.: US09460713B1

    Publication date: 2016-10-04

    Application No.: US14673731

    Application date: 2015-03-30

    Applicant: Google Inc.

    CPC classification number: G10L15/07 G10L15/183 G10L15/197 G10L15/24

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).
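
    Illustration (not from the patent): one way to read the abstract above is that the strength of the bias is scaled by how confident the system is in the inferred context. The sketch below assumes a unigram baseline model stored as a dict of probabilities, context signals given as simple labels, and a hypothetical CONTEXT_BIAS table of log-domain boosts. None of these names or values come from the source.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical biasing parameters: log-domain boosts per word, per context.
CONTEXT_BIAS: Dict[str, Dict[str, float]] = {
    "driving": {"navigate": 2.0, "traffic": 1.5},
    "cooking": {"recipe": 2.0, "timer": 1.5},
}


def likely_context(signals: List[str]) -> Tuple[str, float]:
    """Pick the most frequent context label and use its share as a confidence."""
    counts = Counter(signals)
    context, votes = counts.most_common(1)[0]
    return context, votes / len(signals)


def modulate(bias: Dict[str, float], confidence: float) -> Dict[str, float]:
    """Scale each boost by the context confidence (0 disables biasing)."""
    return {word: boost * confidence for word, boost in bias.items()}


def bias_model(baseline: Dict[str, float], boosts: Dict[str, float]) -> Dict[str, float]:
    """Apply the boosts to the baseline probabilities and renormalize."""
    scaled = {w: p * math.exp(boosts.get(w, 0.0)) for w, p in baseline.items()}
    total = sum(scaled.values())
    return {w: p / total for w, p in scaled.items()}


baseline_model = {"navigate": 0.1, "recipe": 0.1, "hello": 0.8}
context, confidence = likely_context(["driving", "driving", "cooking", "driving"])
biased_model = bias_model(baseline_model, modulate(CONTEXT_BIAS[context], confidence))
```

    With a confidence of 0.75 the boost for "navigate" is applied at three quarters of its full strength. A confidence of 0 would leave the baseline model unchanged.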

    Multi-stage speaker adaptation
    13.
    Granted patent
    Multi-stage speaker adaptation (status: in force)

    Publication No.: US08996366B2

    Publication date: 2015-03-31

    Application No.: US14181908

    Application date: 2014-02-17

    Applicant: Google Inc.

    CPC classification number: G10L17/00 G10L15/065 G10L15/07

    Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.
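
    Illustration (not from the patent): the staged flow in the abstract above can be sketched with toy data. The sketch assumes each "technique" is a simple per-dimension offset subtracted from the feature vectors, and that a technique is selected by comparing the mean feature vector with a reference mean. It also assumes the speaker-dependent technique is selected from the unmodified second set. The tables, labels, and function names are hypothetical.

```python
from statistics import fmean
from typing import Dict, List

Frames = List[List[float]]

# Hypothetical techniques: a reference mean used for selection and an offset
# that is subtracted from every feature vector when the technique is applied.
GENDER_TECHNIQUES: Dict[str, Dict[str, List[float]]] = {
    "female": {"mean": [1.0, 0.0], "offset": [-0.5, 0.0]},
    "male": {"mean": [3.0, 0.0], "offset": [0.5, 0.0]},
}
SPEAKER_TECHNIQUES: Dict[str, Dict[str, List[float]]] = {
    "speaker_1": {"mean": [3.0, 1.0], "offset": [0.25, 0.5]},
    "speaker_2": {"mean": [1.0, 1.0], "offset": [-0.25, 0.5]},
}


def mean_vector(frames: Frames) -> List[float]:
    """Per-dimension mean over a set of feature vectors."""
    return [fmean(column) for column in zip(*frames)]


def select_technique(techniques: Dict[str, Dict[str, List[float]]], frames: Frames) -> str:
    """Choose the technique whose reference mean is closest to the frames' mean."""
    m = mean_vector(frames)
    return min(
        techniques,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(m, techniques[name]["mean"])),
    )


def apply_technique(technique: Dict[str, List[float]], frames: Frames) -> Frames:
    """Modify feature vectors by subtracting the technique's offset."""
    return [[x - o for x, o in zip(frame, technique["offset"])] for frame in frames]


def multi_stage(unit1: Frames, unit2: Frames, unit3: Frames):
    # Stage 1: select a gender-specific technique from the first unit of speech.
    gender = select_technique(GENDER_TECHNIQUES, unit1)
    # Stage 2: apply it to the second unit, then select a speaker-dependent
    # technique from the second unit's characteristics.
    adapted2 = apply_technique(GENDER_TECHNIQUES[gender], unit2)
    speaker = select_technique(SPEAKER_TECHNIQUES, unit2)
    # Stage 3: apply the speaker-dependent technique to the third unit.
    adapted3 = apply_technique(SPEAKER_TECHNIQUES[speaker], unit3)
    return gender, adapted2, speaker, adapted3
```

    The point of the staging is that a coarse technique is available after very little speech, while the finer speaker-dependent technique is chosen once more speech has been observed.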

    Multi-stage speaker adaptation
    14.
    Granted patent

    Publication No.: US08571859B1

    Publication date: 2013-10-29

    Application No.: US13653792

    Application date: 2012-10-17

    Applicant: Google Inc.

    CPC classification number: G10L17/00 G10L15/065 G10L15/07

    Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.

    Determining Dialog States for Language Models

    Publication No.: US20170270929A1

    Publication date: 2017-09-21

    Application No.: US15071651

    Application date: 2016-03-16

    Applicant: Google Inc.

    Abstract: Systems, methods, devices, and other techniques are described herein for determining dialog states that correspond to voice inputs and for biasing a language model based on the determined dialog states. In some implementations, a method includes receiving, at a computing system, audio data that indicates a voice input and determining a particular dialog state, from among a plurality of dialog states, which corresponds to the voice input. A set of n-grams can be identified that are associated with the particular dialog state that corresponds to the voice input. In response to identifying the set of n-grams that are associated with the particular dialog state that corresponds to the voice input, a language model can be biased by adjusting probability scores that the language model indicates for n-grams in the set of n-grams. The voice input can be transcribed using the adjusted language model.
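
    Illustration (not from the patent): the biasing step in the abstract above could look roughly like the sketch below. As a simplification, it assumes the dialog state is already known and applies the adjustment by rescoring whole candidate transcriptions, rather than by modifying scores inside a language model. The state names, n-gram sets, and boost value are hypothetical.

```python
import math
from typing import Dict, List, Set

# Hypothetical mapping from dialog state to the n-grams associated with it.
STATE_NGRAMS: Dict[str, Set[str]] = {
    "choose_topping": {"pepperoni", "extra cheese", "mushrooms"},
    "confirm_order": {"yes", "no", "cancel"},
}


def ngrams(tokens: List[str], max_n: int = 2) -> Set[str]:
    """All n-grams of length 1..max_n in the token sequence."""
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def rescore(hypotheses: Dict[str, float], state: str,
            boost: float = math.log(4.0)) -> Dict[str, float]:
    """Add a log-domain boost for each state-associated n-gram in a hypothesis."""
    favored = STATE_NGRAMS[state]
    return {
        text: logp + boost * len(ngrams(text.split()) & favored)
        for text, logp in hypotheses.items()
    }


def transcribe(hypotheses: Dict[str, float], state: str) -> str:
    """Return the best hypothesis under the state-biased scores."""
    scores = rescore(hypotheses, state)
    return max(scores, key=scores.get)


# Hypothetical unbiased log-probabilities for two competing hypotheses.
candidates = {"pepper only": math.log(0.5), "pepperoni": math.log(0.3)}
```

    With these toy scores, the "choose_topping" state flips the result to "pepperoni", while the "confirm_order" state leaves the unbiased winner "pepper only" in place.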

    ENHANCED SPEECH ENDPOINTING
    17.
    Patent application

    Publication No.: US20170069308A1

    Publication date: 2017-03-09

    Application No.: US14844563

    Application date: 2015-09-03

    Applicant: Google Inc.

    CPC classification number: G10L15/04 G06F17/2765 G10L15/18 G10L2015/228

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data including an utterance, obtaining context data that indicates one or more expected speech recognition results, determining an expected speech recognition result based on the context data, receiving an intermediate speech recognition result generated by a speech recognition engine, comparing the intermediate speech recognition result to the expected speech recognition result for the audio data based on the context data, determining whether the intermediate speech recognition result corresponds to the expected speech recognition result for the audio data based on the context data, and setting an end of speech condition and providing a final speech recognition result in response to determining the intermediate speech recognition result matches the expected speech recognition result, the final speech recognition result including the one or more expected speech recognition results indicated by the context data.
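
    Illustration (not from the patent): the comparison loop in the abstract above can be sketched as follows. The sketch assumes intermediate recognition results arrive as a stream of strings and that a match means equality after lowercasing and whitespace normalization. The EndpointDecision type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class EndpointDecision:
    final_result: str
    end_of_speech: bool      # True if an expected result triggered the endpoint
    partials_consumed: int   # number of intermediate results examined


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return " ".join(text.lower().split())


def endpoint_on_expected(partials: Iterable[str], expected: Iterable[str]) -> EndpointDecision:
    """Stop as soon as an intermediate result matches an expected result."""
    expected_by_key = {normalize(e): e for e in expected}
    last = ""
    consumed = 0
    for partial in partials:
        consumed += 1
        last = partial
        match = expected_by_key.get(normalize(partial))
        if match is not None:
            # Set the end-of-speech condition and return the expected result.
            return EndpointDecision(match, True, consumed)
    # No match: a regular endpointer (for example a silence timeout) would decide.
    return EndpointDecision(last, False, consumed)
```

    For a yes/no prompt, endpoint_on_expected(["ye", "yes", "yes please"], ["Yes", "No"]) stops at the second intermediate result instead of waiting for trailing audio.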

    LANGUAGE MODEL BIASING MODULATION
    18.
    Patent application

    Publication No.: US20160379625A1

    Publication date: 2016-12-29

    Application No.: US15263714

    Application date: 2016-09-13

    Applicant: Google Inc.

    CPC classification number: G10L15/07 G10L15/183 G10L15/197 G10L15/24

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).

    LANGUAGE MODEL BIASING MODULATION
    19.
    Patent application
    LANGUAGE MODEL BIASING MODULATION (status: in force)

    Publication No.: US20160293163A1

    Publication date: 2016-10-06

    Application No.: US14673731

    Application date: 2015-03-30

    Applicant: Google Inc.

    CPC classification number: G10L15/07 G10L15/183 G10L15/197 G10L15/24

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).

    DYNAMICALLY BIASING LANGUAGE MODELS
    20.
    Patent application
    DYNAMICALLY BIASING LANGUAGE MODELS (status: in force)

    Publication No.: US20160104482A1

    Publication date: 2016-04-14

    Application No.: US14525826

    Application date: 2014-10-28

    Applicant: Google Inc.

    CPC classification number: G10L15/22 G10L15/1815 G10L15/26 G10L15/32 G10L19/00

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.
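
    Illustration (not from the patent): the two-pass flow in the abstract above can be sketched with a stand-in recognizer. The sketch assumes the "audio" has already been reduced to candidate transcriptions with base scores, that context is identified by keyword lookup in the first-pass output, and that biasing adds a fixed bonus per context word. All tables, names, and values are hypothetical.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical keyword and vocabulary tables per context.
CONTEXT_KEYWORDS: Dict[str, FrozenSet[str]] = {
    "music": frozenset({"play", "song"}),
    "navigation": frozenset({"directions", "route"}),
}
CONTEXT_VOCAB: Dict[str, FrozenSet[str]] = {
    "music": frozenset({"jazz", "rock"}),
    "navigation": frozenset({"street", "avenue"}),
}


def recognize(candidates: Dict[str, float],
              bias_words: FrozenSet[str] = frozenset(),
              boost: float = 0.3) -> str:
    """Stand-in recognizer: pick the best candidate, favoring bias words."""
    def score(text: str) -> float:
        bonus = boost * sum(1 for word in text.split() if word in bias_words)
        return candidates[text] + bonus
    return max(candidates, key=score)


def identify_context(text: str) -> Optional[str]:
    """Return the first context whose keywords appear in the transcription."""
    tokens = set(text.split())
    for context, keywords in CONTEXT_KEYWORDS.items():
        if tokens & keywords:
            return context
    return None


def two_pass(candidates: Dict[str, float]) -> str:
    first = recognize(candidates)            # first speech recognition
    context = identify_context(first)        # context from the first pass
    if context is None:
        return first
    return recognize(candidates, CONTEXT_VOCAB[context])  # biased second pass


utterance = {"play some jas": 0.5, "play some jazz": 0.4}
```

    Here the first pass prefers "play some jas", the word "play" identifies a music context, and the second pass, biased toward music vocabulary, returns "play some jazz".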
