-
Publication number: US20160335677A1
Publication date: 2016-11-17
Application number: US14710928
Filing date: 2015-05-13
Applicant: Google Inc.
Inventor: Petar Aleksic , Pedro J. Moreno Mengibar
CPC classification number: G06Q30/0275 , G06Q30/0256 , G10L13/00 , G10L15/01 , G10L15/06 , G10L15/18 , G10L15/187 , G10L15/26 , G10L2015/088
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition are disclosed. In one aspect, a method includes receiving a candidate adword from an advertiser. The method further includes generating a score for the candidate adword based on a likelihood of a speech recognizer generating, based on an utterance of the candidate adword, a transcription that includes a word that is associated with an expected pronunciation of the candidate adword. The method further includes classifying, based at least on the score, the candidate adword as an appropriate adword for use in a bidding process for advertisements that are selected based on a transcription of a speech query or as not an appropriate adword for use in the bidding process for advertisements that are selected based on the transcription of the speech query.
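The scoring and classification steps in this abstract can be illustrated with a minimal Python sketch. It assumes the score is simply the fraction of test transcriptions that contain the expected word, and that a fixed threshold separates the two classes; the function names, the 0.8 threshold, and the sample transcriptions are hypothetical and not taken from the patent.

```python
def adword_score(transcriptions, expected_word):
    """Fraction of recognizer transcriptions that contain the expected word.

    Assumption: each transcription came from one utterance of the candidate adword.
    """
    if not transcriptions:
        return 0.0
    target = expected_word.lower()
    hits = sum(1 for text in transcriptions if target in text.lower().split())
    return hits / len(transcriptions)


def classify_adword(score, threshold=0.8):
    """Label the candidate for voice-query bidding (threshold is a hypothetical value)."""
    return "appropriate" if score >= threshold else "not appropriate"


# Hypothetical recognizer outputs for four utterances of the candidate adword.
outputs = ["buy nike shoes", "buy mikey shoes", "nike store", "nike"]
score = adword_score(outputs, "nike")
label = classify_adword(score)
```

A real system would estimate the likelihood from the recognizer's own models rather than from a handful of sample transcriptions.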
-
Publication number: US09460713B1
Publication date: 2016-10-04
Application number: US14673731
Filing date: 2015-03-30
Applicant: Google Inc.
Inventor: Pedro J. Moreno Mengibar , Petar Aleksic
IPC: G10L15/00 , G10L15/197 , G10L15/08
CPC classification number: G10L15/07 , G10L15/183 , G10L15/197 , G10L15/24
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).
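As a rough illustration of the flow described in this abstract, the Python sketch below selects biasing parameters for a likely context, scales them by the context confidence score, and applies them to a baseline model. It assumes the language model is a plain dictionary of phrase log-probabilities and that biasing means adding a boost to selected phrases; the `CONTEXT_BIAS` table, the function names, and all numbers are hypothetical.

```python
# Hypothetical biasing parameters per context: phrases to boost and a maximum boost.
CONTEXT_BIAS = {
    "navigation": {"phrases": ["directions to", "navigate home"], "max_boost": 2.0},
    "music": {"phrases": ["play song", "next track"], "max_boost": 1.5},
}


def adjusted_boost(context, confidence):
    """Scale the context's maximum boost by the context confidence score, clamped to [0, 1]."""
    confidence = min(max(confidence, 0.0), 1.0)
    return CONTEXT_BIAS[context]["max_boost"] * confidence


def bias_language_model(baseline_logprobs, context, confidence):
    """Return a biased copy of the baseline phrase log-probabilities.

    Note: the result is not renormalized; this is a simplification.
    """
    boost = adjusted_boost(context, confidence)
    biased = dict(baseline_logprobs)
    for phrase in CONTEXT_BIAS[context]["phrases"]:
        if phrase in biased:
            biased[phrase] += boost
    return biased


baseline = {"directions to": -5.0, "play song": -4.0}
biased = bias_language_model(baseline, "navigation", confidence=0.5)
```

With a confidence of 0 the baseline model is returned unchanged, so an uncertain context guess cannot distort recognition in this sketch.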
-
Publication number: US08996366B2
Publication date: 2015-03-31
Application number: US14181908
Filing date: 2014-02-17
Applicant: Google Inc.
Inventor: Petar Aleksic , Xin Lei
IPC: G10L15/07 , G10L17/00 , G10L15/065
CPC classification number: G10L17/00 , G10L15/065 , G10L15/07
Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.
-
Publication number: US08571859B1
Publication date: 2013-10-29
Application number: US13653792
Filing date: 2012-10-17
Applicant: Google Inc.
Inventor: Petar Aleksic , Xin Lei
IPC: G10L15/00
CPC classification number: G10L17/00 , G10L15/065 , G10L15/07
Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.
-
Publication number: US09971758B1
Publication date: 2018-05-15
Application number: US14989621
Filing date: 2016-01-06
Applicant: Google Inc.
Inventor: Evgeny A. Cherepanov , Gleb Skobeltsyn , Jakob Nicolaus Foerster , Petar Aleksic , Assaf Avner Hurwitz Michaely
IPC: G10L15/26 , G06F17/27 , G10L15/32 , G10L15/197 , G10L15/187 , G10L15/08
CPC classification number: G06F17/273 , G10L15/187 , G10L15/197 , G10L15/22 , G10L15/26 , G10L15/32 , G10L2015/086
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.
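The final merge step in this abstract can be sketched in Python as follows. The sketch assumes the second voice input has already been recognized as a sequence of space-separated letters and that the user's selection appears verbatim in the first recognition output; `merge_correction` and the sample strings are hypothetical.

```python
def merge_correction(first_output, selected_text, spelled_letters):
    """Replace the user-selected span with the word spelled out in the second input.

    Assumptions: `spelled_letters` looks like "j o h n", and only the first
    occurrence of the selected span is replaced.
    """
    correction = "".join(spelled_letters.split())
    return first_output.replace(selected_text, correction, 1)


first = "send email to jon smith"
second = merge_correction(first, "jon", "j o h n")
```

Substring replacement is a simplification; a fuller version would track the token positions of the selection so that a match inside a longer word is never rewritten.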
-
Publication number: US20170270929A1
Publication date: 2017-09-21
Application number: US15071651
Filing date: 2016-03-16
Applicant: Google Inc.
Inventor: Petar Aleksic , Pedro J. Moreno Mengibar
CPC classification number: G10L15/22 , G06F17/278 , G06F17/2785 , G10L15/065 , G10L15/183 , G10L15/197 , G10L15/26
Abstract: Systems, methods, devices, and other techniques are described herein for determining dialog states that correspond to voice inputs and for biasing a language model based on the determined dialog states. In some implementations, a method includes receiving, at a computing system, audio data that indicates a voice input and determining a particular dialog state, from among a plurality of dialog states, which corresponds to the voice input. A set of n-grams can be identified that are associated with the particular dialog state that corresponds to the voice input. In response to identifying the set of n-grams that are associated with the particular dialog state that corresponds to the voice input, a language model can be biased by adjusting probability scores that the language model indicates for n-grams in the set of n-grams. The voice input can be transcribed using the adjusted language model.
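A minimal Python sketch of dialog-state biasing, under simplifying assumptions: the language model is a dictionary of n-gram probability scores, each dialog state maps to a fixed set of n-grams, and biasing multiplies the scores of those n-grams by a constant factor before renormalizing. The state names, the factor of 3.0, and the toy scores are hypothetical.

```python
# Hypothetical mapping from dialog state to the n-grams expected in that state.
DIALOG_STATE_NGRAMS = {
    "ask_pizza_size": {"small", "medium", "large"},
    "ask_payment": {"credit card", "cash"},
}


def bias_scores(lm_scores, dialog_state, factor=3.0):
    """Raise the scores of n-grams tied to the dialog state, then renormalize."""
    ngrams = DIALOG_STATE_NGRAMS.get(dialog_state, set())
    raised = {ng: (p * factor if ng in ngrams else p) for ng, p in lm_scores.items()}
    total = sum(raised.values())
    return {ng: p / total for ng, p in raised.items()}


def transcribe(candidates, lm_scores):
    """Pick the candidate n-gram with the highest language model score."""
    return max(candidates, key=lambda ng: lm_scores.get(ng, 0.0))


lm = {"large": 0.2, "march": 0.3, "cash": 0.5}
biased = bias_scores(lm, "ask_pizza_size")
before = transcribe(["large", "march"], lm)
after = transcribe(["large", "march"], biased)
```

In this toy case the acoustically confusable pair "large" and "march" flips from "march" to "large" once the system knows it has just asked for a pizza size.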
-
Publication number: US20170069308A1
Publication date: 2017-03-09
Application number: US14844563
Filing date: 2015-09-03
Applicant: Google Inc.
Inventor: Petar Aleksic , Glen Shires , Michael Buchanan
CPC classification number: G10L15/04 , G06F17/2765 , G10L15/18 , G10L2015/228
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data including an utterance, obtaining context data that indicates one or more expected speech recognition results, determining an expected speech recognition result based on the context data, receiving an intermediate speech recognition result generated by a speech recognition engine, comparing the intermediate speech recognition result to the expected speech recognition result for the audio data based on the context data, determining whether the intermediate speech recognition result corresponds to the expected speech recognition result for the audio data based on the context data, and setting an end of speech condition and providing a final speech recognition result in response to determining the intermediate speech recognition result matches the expected speech recognition result, the final speech recognition result including the one or more expected speech recognition results indicated by the context data.
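The comparison loop in this abstract can be sketched in Python as below. It assumes intermediate results arrive as a sequence of strings and that a match means equality after lowercasing and whitespace normalization; the function names and sample inputs are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so that comparisons ignore formatting."""
    return " ".join(text.lower().split())


def endpoint_on_expected(intermediate_results, expected_results):
    """Return (result, end_of_speech).

    Stops at the first intermediate result that matches an expected result and
    sets the end-of-speech condition; otherwise returns the last result seen.
    """
    expected = {normalize(e) for e in expected_results}
    last = None
    for result in intermediate_results:
        last = result
        if normalize(result) in expected:
            return normalize(result), True
    return last, False


final, ended = endpoint_on_expected(["ye", "yes", "yes please"], ["Yes", "No"])
```

Here the recognizer stops at "yes" without waiting for trailing audio, which is the latency benefit the abstract points to.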
-
Publication number: US20160379625A1
Publication date: 2016-12-29
Application number: US15263714
Filing date: 2016-09-13
Applicant: Google Inc.
Inventor: Pedro J. Moreno-Mengibar , Petar Aleksic
IPC: G10L15/07 , G10L15/24 , G10L15/183
CPC classification number: G10L15/07 , G10L15/183 , G10L15/197 , G10L15/24
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).
-
Publication number: US20160293163A1
Publication date: 2016-10-06
Application number: US14673731
Filing date: 2015-03-30
Applicant: Google Inc.
Inventor: Pedro J. Moreno Mengibar , Petar Aleksic
IPC: G10L15/197 , G10L15/08
CPC classification number: G10L15/07 , G10L15/183 , G10L15/197 , G10L15/24
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).
-
Publication number: US20160104482A1
Publication date: 2016-04-14
Application number: US14525826
Filing date: 2014-10-28
Applicant: Google Inc.
Inventor: Petar Aleksic , Pedro J. Moreno Mengibar
CPC classification number: G10L15/22 , G10L15/1815 , G10L15/26 , G10L15/32 , G10L19/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.
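The two-pass flow in this abstract can be sketched in Python with a toy recognizer. Everything here is a stand-in: the "audio" is a list of pre-scored hypotheses, `toy_recognize` and `identify_context` are hypothetical helpers, and biasing is modeled as a fixed bonus for hypotheses whose topic matches the context.

```python
def toy_recognize(audio, bias_context=None):
    """Stand-in recognizer: `audio` is a list of (text, score, topic) hypotheses.

    Hypotheses whose topic equals `bias_context` receive a fixed bonus of 1.0.
    """
    scored = [
        (score + (1.0 if topic == bias_context else 0.0), text)
        for text, score, topic in audio
    ]
    return max(scored)[1]


def identify_context(transcript):
    """Hypothetical keyword rule mapping a first-pass transcript to a context."""
    return "phone" if "call" in transcript.split() else None


def two_pass_recognize(audio):
    first = toy_recognize(audio)                        # first, unbiased pass
    context = identify_context(first)                   # context from first pass
    return toy_recognize(audio, bias_context=context)   # second, biased pass


audio = [
    ("call mom on her cell", 0.6, "phone"),
    ("call mom on her sell", 0.7, "shopping"),
]
first_pass = toy_recognize(audio)
second_pass = two_pass_recognize(audio)
```

The first pass prefers the higher-scoring but wrong hypothesis; the second pass, biased toward the context found in the first, recovers the intended one.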