-
Publication number: US09129591B2
Publication date: 2015-09-08
Application number: US13726954
Application date: 2012-12-26
Applicant: Google Inc.
Inventor: Yun-hsuan Sung , Francoise Beaufays , Brian Strope , Hui Lin , Jui-Ting Huang
IPC: G10L15/28 , G10L15/00 , G10L15/32 , G10L15/183
CPC classification number: G10L15/005 , G10L15/183 , G10L15/32
Abstract: Speech recognition systems may perform the following operations: receiving audio; recognizing the audio using language models for different languages to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding recognition scores; identifying a candidate language for the audio; selecting a recognition candidate based on the recognition scores and the candidate language; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.
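The selection step in this abstract can be illustrated with a minimal sketch. It assumes each language-specific recognizer returns one hypothesis with a score between 0 and 1, and that the identified candidate language is applied as a fixed additive bonus. The `Candidate` class, the `select_candidate` function, and the `language_bonus` parameter are hypothetical names chosen for illustration; the patent does not specify this scoring rule.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    language: str  # language of the model that produced this hypothesis
    text: str      # recognized transcript
    score: float   # recognition score, higher is better (assumed 0..1 scale)


def select_candidate(candidates, candidate_language, language_bonus=0.2):
    """Pick the best hypothesis, favoring the identified candidate language.

    The additive bonus is an assumption made for this sketch only.
    """
    if not candidates:
        raise ValueError("no recognition candidates")

    def adjusted(c):
        bonus = language_bonus if c.language == candidate_language else 0.0
        return c.score + bonus

    return max(candidates, key=adjusted)


# One hypothesis per language model for the same audio (made-up values).
candidates = [
    Candidate("en", "I see", 0.60),
    Candidate("es", "sí", 0.55),
    Candidate("fr", "ici", 0.40),
]
best = select_candidate(candidates, candidate_language="es")
print(best.language, best.text)
```

With these made-up scores, the Spanish hypothesis overtakes the English one only because the language identifier points to Spanish; if the identified language matches none of the candidates, the raw recognition score decides.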
-
Publication number: US11373086B2
Publication date: 2022-06-28
Application number: US15476292
Application date: 2017-03-31
Applicant: Google Inc.
Inventor: Brian Strope , Yun-hsuan Sung , Matthew Henderson , Rami Al-Rfou' , Raymond Kurzweil
IPC: G06N3/04 , H04L51/02 , G06N3/08 , H04L51/046
Abstract: Systems, methods, and computer readable media related to determining one or more responses to provide that are responsive to an electronic communication that is generated through interaction with a client computing device. For example, determining one or more responses to provide for presentation to a user as suggestions for inclusion in a reply to an electronic communication sent to the user. Some implementations are related to training and/or using separate input and response neural network models for determining responses for electronic communications. The input neural network model and the response neural network model can be separate, but trained and/or used cooperatively.
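One way to picture separate input and response models being used cooperatively is sketched below. It assumes each model maps its own text to a vector and that responses are ranked by dot product against the vector for the incoming message. The lookup table `RESPONSE_VECTORS`, the keyword-based `encode_input`, and all numeric values are hypothetical stand-ins for trained networks, not the patented models.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Stand-in for the response model: vectors computed ahead of time for a
# fixed set of candidate replies (values are made up for illustration).
RESPONSE_VECTORS = {
    "Sounds good, see you then.": [0.9, 0.1, 0.0],
    "Sorry, I can't make it.": [0.7, 0.2, 0.0],
    "Thanks for the update!": [0.0, 0.2, 0.9],
}


def encode_input(message):
    """Stand-in for the input model: three toy keyword features."""
    text = message.lower()
    return [
        1.0 if any(w in text for w in ("meet", "lunch", "tomorrow")) else 0.0,
        1.0 if "?" in text else 0.0,
        1.0 if any(w in text for w in ("report", "attached", "fyi")) else 0.0,
    ]


def suggest_replies(message, k=2):
    """Rank the candidate replies by dot product with the input vector."""
    query = encode_input(message)
    ranked = sorted(
        RESPONSE_VECTORS,
        key=lambda reply: dot(query, RESPONSE_VECTORS[reply]),
        reverse=True,
    )
    return ranked[:k]


print(suggest_replies("Can we meet for lunch tomorrow?"))
```

Because the two sides only meet at the dot product, the response vectors can be computed once and reused for every incoming message, which is one plausible reason to keep the models separate.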
-
Publication number: US20180240014A1
Publication date: 2018-08-23
Application number: US15476292
Application date: 2017-03-31
Applicant: Google Inc.
Inventor: Brian Strope , Yun-hsuan Sung , Matthew Henderson , Rami Al-Rfou' , Raymond Kurzweil
CPC classification number: G06N3/0454 , G06N3/084 , H04L51/02 , H04L51/046
Abstract: Systems, methods, and computer readable media related to determining one or more responses to provide that are responsive to an electronic communication that is generated through interaction with a client computing device. For example, determining one or more responses to provide for presentation to a user as suggestions for inclusion in a reply to an electronic communication sent to the user. Some implementations are related to training and/or using separate input and response neural network models for determining responses for electronic communications. The input neural network model and the response neural network model can be separate, but trained and/or used cooperatively.
-
Publication number: US09275635B1
Publication date: 2016-03-01
Application number: US13672945
Application date: 2012-11-09
Applicant: Google Inc.
Inventor: Francoise Beaufays , Brian Strope , Yun-hsuan Sung
IPC: G10L15/00
CPC classification number: G10L15/32 , G10L15/183
Abstract: Speech recognition systems may perform the following operations: receiving audio at a computing device; identifying a language associated with the audio; recognizing the audio using recognition models for different versions of the language to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding information; comparing the information of the recognition candidates to identify agreement between at least two of the recognition models; selecting a recognition candidate based on information of the recognition candidate and agreement between the at least two of the recognition models; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.
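A minimal sketch of agreement-based selection follows. It assumes the "corresponding information" is a transcript plus a score, that agreement means identical normalized transcripts from different models, and that each additional agreeing model adds a fixed bonus. The function name `select_by_agreement`, the `agreement_weight` parameter, and the dialect labels are hypothetical; the patent may combine these signals differently.

```python
from collections import defaultdict


def select_by_agreement(hypotheses, agreement_weight=0.5):
    """Choose a transcript using per-model scores plus cross-model agreement.

    hypotheses: iterable of (model_name, text, score) tuples.
    """
    hypotheses = list(hypotheses)
    if not hypotheses:
        raise ValueError("no recognition candidates")

    models_for = defaultdict(set)  # transcript -> models that produced it
    best_score = {}                # transcript -> best score seen for it
    for model, text, score in hypotheses:
        key = text.strip().lower()
        models_for[key].add(model)
        best_score[key] = max(score, best_score.get(key, float("-inf")))

    def combined(key):
        # Each additional agreeing model adds a fixed bonus (an assumption).
        return best_score[key] + agreement_weight * (len(models_for[key]) - 1)

    return max(best_score, key=combined)


# Hypotheses from three models for different versions of English (made up).
hyps = [
    ("en-US", "call tom", 0.80),
    ("en-GB", "call mum", 0.74),
    ("en-AU", "call mum", 0.65),
]
print(select_by_agreement(hyps))
```

In this example the single highest-scoring hypothesis loses to a transcript that two models agree on; setting `agreement_weight=0` falls back to picking by score alone.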
-
Publication number: US20180240013A1
Publication date: 2018-08-23
Application number: US15476280
Application date: 2017-03-31
Applicant: Google Inc.
Inventor: Brian Strope , Yun-hsuan Sung , Matthew Henderson , Rami Al-Rfou' , Raymond Kurzweil
CPC classification number: G06N3/084 , G06F16/00 , G06F16/335 , G06N3/0445 , G06N3/0454 , G06N5/04
Abstract: Systems, methods, and computer readable media related to information retrieval. Some implementations are related to training and/or using a relevance model for information retrieval. The relevance model includes an input neural network model and a subsequent content neural network model. The input neural network model and the subsequent content neural network model can be separate, but trained and/or used cooperatively. The input neural network model and the subsequent content neural network model can be “separate” in that separate inputs are applied to the neural network models, and each of the neural network models is used to generate its own feature vector based on its applied input. A comparison of the feature vectors generated based on the separate network models can then be performed, where the comparison indicates relevance of the input applied to the input neural network model to the separate input applied to the subsequent content neural network model.
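The feature-vector comparison described in this abstract can be sketched as follows. The sketch assumes each network is a single tanh layer with its own weights and that the comparison is cosine similarity; the abstract names neither choice. The weight matrices, the feature dimensions, and the function names are hypothetical.

```python
import math


def linear_tanh(weights, x):
    """One dense layer with tanh activation (stand-in for a trained network)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def cosine(u, v):
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


# Separate parameters for each network (made-up values). The input network
# takes 3 features and the content network takes 2; both emit 2-d vectors.
INPUT_WEIGHTS = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
CONTENT_WEIGHTS = [[0.8, 0.2], [-0.3, 0.9]]


def relevance(input_features, content_features):
    """Apply each input to its own network, then compare the feature vectors."""
    u = linear_tanh(INPUT_WEIGHTS, input_features)
    v = linear_tanh(CONTENT_WEIGHTS, content_features)
    return cosine(u, v)


query = [1.0, 0.0, 0.0]
print(relevance(query, [1.0, 0.0]), relevance(query, [0.0, 1.0]))
```

The two inputs never pass through the same network, so they may have different dimensions and formats; only the output vectors need to share a space for the comparison to be meaningful.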
-
Publication number: US20130238336A1
Publication date: 2013-09-12
Application number: US13726954
Application date: 2012-12-26
Applicant: GOOGLE INC.
Inventor: Yun-hsuan Sung , Francoise Beaufays , Brian Strope , Hui Lin , Jui-Ting Huang
IPC: G10L15/00
CPC classification number: G10L15/005 , G10L15/183 , G10L15/32
Abstract: Speech recognition systems may perform the following operations: receiving audio; recognizing the audio using language models for different languages to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding recognition scores; identifying a candidate language for the audio; selecting a recognition candidate based on the recognition scores and the candidate language; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.
-
Publication number: US11188824B2
Publication date: 2021-11-30
Application number: US15476280
Application date: 2017-03-31
Applicant: Google Inc.
Inventor: Brian Strope , Yun-hsuan Sung , Matthew Henderson , Rami Al-Rfou' , Raymond Kurzweil
IPC: G06N3/08 , G06N5/04 , G06F16/00 , G06N3/04 , G06F16/335
Abstract: Systems, methods, and computer readable media related to information retrieval. Some implementations are related to training and/or using a relevance model for information retrieval. The relevance model includes an input neural network model and a subsequent content neural network model. The input neural network model and the subsequent content neural network model can be separate, but trained and/or used cooperatively. The input neural network model and the subsequent content neural network model can be “separate” in that separate inputs are applied to the neural network models, and each of the neural network models is used to generate its own feature vector based on its applied input. A comparison of the feature vectors generated based on the separate network models can then be performed, where the comparison indicates relevance of the input applied to the input neural network model to the separate input applied to the subsequent content neural network model.
-
Publication number: US09460088B1
Publication date: 2016-10-04
Application number: US13906654
Application date: 2013-05-31
Applicant: Google Inc.
Inventor: Hasim Sak , Yun-hsuan Sung , Cyril Georges Luc Allauzen
IPC: G06F17/28 , G06F17/27 , G10L15/26 , G10L15/28 , G10L15/06 , G10L15/14 , G10L15/04 , G10L19/00 , G10L21/00 , G10L25/00
CPC classification number: G06F17/2881 , G06F17/2765 , G10L15/19
Abstract: An automatic speech recognition system and method are provided for written-domain language modeling. According to one implementation, a process includes accessing decomposed training data that results from applying rewrite grammar rules to original training data, the decomposed training data comprising (i) regular words from the original training data that have not been rewritten using the set of rewrite grammar rules, and (ii) decomposed segments that result from rewriting non-lexical entities from the original training data using the rewrite grammar rules; generating a restriction model that (i) maps language model paths for regular words to themselves, and (ii) restricts language model paths for decomposed segments for non-lexical entities; training an n-gram language model over the training data; composing the restriction model and the language model to obtain a restricted language model; and constructing a decoding network by composing a context dependency model and a pronunciation lexicon with the restricted language model.
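The first step of this process, producing decomposed training data, can be illustrated with a rough sketch. It assumes two toy rewrite rules written as regular expressions, one for clock times and one for digit strings, and marks each decomposed segment with start and end tags. The rules, the tag names, and the functions `decompose_token` and `decompose` are hypothetical; the patent's actual rewrite grammar, restriction model, and decoding-network construction are not reproduced here.

```python
import re

# Hypothetical rewrite rules for two kinds of non-lexical entities.
TIME = re.compile(r"^(\d{1,2}):(\d{2})$")
NUMBER = re.compile(r"^\d+$")


def decompose_token(token):
    """Rewrite a non-lexical entity into tagged segments; keep regular words."""
    match = TIME.match(token)
    if match:
        return ["<time>", match.group(1), ":", match.group(2), "</time>"]
    if NUMBER.match(token):
        # Split a digit string into single digits (an assumption of this sketch).
        return ["<num>"] + list(token) + ["</num>"]
    return [token]  # regular word: passed through unchanged


def decompose(sentence):
    """Apply the rewrite rules to every whitespace-separated token."""
    segments = []
    for token in sentence.split():
        segments.extend(decompose_token(token))
    return segments


print(decompose("wake me at 7:30"))
print(decompose("call 911 now"))
```

A language model trained over such segments sees a small, closed vocabulary for times and numbers instead of one vocabulary entry per distinct value; the tags give a later restriction step something to anchor on.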