-
Publication number: US09299347B1
Publication date: 2016-03-29
Application number: US14685790
Application date: 2015-04-14
Applicant: Google Inc.
Inventor: Olivier Siohan, Pedro J. Moreno Mengibar
CPC classification number: G10L15/10, G10L15/02, G10L15/20, G10L15/265, G10L21/0308, G10L2015/025
Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
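The lookup this abstract describes can be illustrated with a minimal sketch. It assumes segments are represented as small feature vectors and that "selecting" means a nearest-neighbour match against the corrupted side of each stored pair; the table `ASSOCIATIONS`, the function names, and the distance measure are all hypothetical and are not taken from the patent.

```python
import math

# Hypothetical association data, built before any utterance arrives.
# Each pair is (features of a corrupted segment, features of its clean source).
ASSOCIATIONS = [
    ((0.9, 0.1), (1.0, 0.0)),
    ((0.2, 0.8), (0.0, 1.0)),
    ((0.6, 0.6), (0.5, 0.5)),
]


def select_clean(segment, associations=ASSOCIATIONS):
    """Return the clean vector whose corrupted counterpart is closest to `segment`."""
    _, clean = min(associations, key=lambda pair: math.dist(pair[0], segment))
    return clean


def clean_features(utterance):
    """Map every received (possibly noisy) segment to its selected clean data."""
    return [select_clean(segment) for segment in utterance]


noisy_utterance = [(0.85, 0.2), (0.1, 0.9)]
print(clean_features(noisy_utterance))  # [(1.0, 0.0), (0.0, 1.0)]
```

In this sketch the transcription step would then run on the returned clean vectors instead of the received ones.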
-
Publication number: US20160267903A1
Publication date: 2016-09-15
Application number: US15164263
Application date: 2016-05-25
Applicant: Google Inc.
Inventor: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
CPC classification number: G10L15/063, G10L15/01, G10L15/16, G10L15/187, G10L15/30, G10L15/32, G10L2015/0633
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.
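The data flow in this abstract (re-transcribe logged utterances offline, keep a selected subset, use it as training data) might look roughly like the sketch below. The abstract does not say how the subset is selected, so the confidence threshold here is an assumption, and `offline_transcribe` is a hypothetical stand-in backed by fixed values rather than a real recognizer.

```python
from dataclasses import dataclass


@dataclass
class Transcription:
    audio_id: str
    text: str
    confidence: float  # hypothetical score reported by the offline recognizer


def offline_transcribe(audio_id):
    """Stand-in for a slower, more accurate offline recognizer (fixed fake output)."""
    fake_results = {
        "a1": ("call mom", 0.95),
        "a2": ("play musak", 0.40),
        "a3": ("set a timer", 0.88),
    }
    text, confidence = fake_results[audio_id]
    return Transcription(audio_id, text, confidence)


def select_training_set(audio_ids, min_confidence=0.8):
    """Transcribe logged utterances offline and keep only the confident ones."""
    transcripts = [offline_transcribe(audio_id) for audio_id in audio_ids]
    return [t for t in transcripts if t.confidence >= min_confidence]


selected = select_training_set(["a1", "a2", "a3"])
print([t.audio_id for t in selected])  # ['a1', 'a3']
```

The selected transcriptions would then serve as training labels for the updated production components; that training step is omitted here.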
-
Publication number: US20160093294A1
Publication date: 2016-03-31
Application number: US14693268
Application date: 2015-04-22
Applicant: Google Inc.
Inventor: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
IPC: G10L15/06, G10L15/187, G10L15/26, G10L25/30
CPC classification number: G10L15/063, G10L15/01, G10L15/16, G10L15/187, G10L15/30, G10L15/32, G10L2015/0633
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.
-
Publication number: US08571860B2
Publication date: 2013-10-29
Application number: US13750807
Application date: 2013-01-25
Applicant: Google Inc.
Inventor: Brian Strope, Francoise Beaufays, Olivier Siohan
IPC: G10L15/00
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
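The "start several recognizers, stop once one is confident enough" behaviour in this abstract can be sketched with Python threads. Everything below is illustrative: the recognizers are simulated with delays, `make_recognizer` and `recognize_with_early_exit` are hypothetical names, and aborting is done cooperatively through a `threading.Event` because Python threads cannot be interrupted from outside.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_recognizer(text, confidence, delay):
    """Build a fake recognizer that 'decodes' for `delay` seconds unless aborted."""
    def recognize(audio, stop):
        if stop.wait(delay):  # returns True as soon as an abort is requested
            return None
        return text, confidence
    return recognize


def recognize_with_early_exit(audio, recognizers, threshold=0.8):
    """Run all recognizers; abort the unfinished ones once a result meets `threshold`."""
    stop = threading.Event()
    results = []
    with ThreadPoolExecutor(max_workers=len(recognizers)) as pool:
        futures = [pool.submit(r, audio, stop) for r in recognizers]
        for future in as_completed(futures):
            result = future.result()
            if result is not None:
                results.append(result)
            if any(conf >= threshold for _, conf in results):
                stop.set()  # tell the still-running recognizers to give up
                break
    # Final result: the most confident of the results actually generated.
    return max(results, key=lambda r: r[1])


recognizers = [
    make_recognizer("call mom", 0.90, delay=0.01),
    make_recognizer("call tom", 0.99, delay=2.0),
]
start = time.monotonic()
print(recognize_with_early_exit(b"audio", recognizers))  # ('call mom', 0.9)
print(time.monotonic() - start < 1.0)  # True: the slow recognizer was aborted
```

If no result reaches the threshold, the loop simply waits for every recognizer and returns the best result available.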
-
Publication number: US20160275951A1
Publication date: 2016-09-22
Application number: US15171374
Application date: 2016-06-02
Applicant: Google Inc.
Inventor: Brian Patrick Strope, Francoise Beaufays, Olivier Siohan
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
-
Publication number: US09263033B2
Publication date: 2016-02-16
Application number: US14314295
Application date: 2014-06-25
Applicant: Google Inc.
Inventor: Olivier Siohan, Pedro J. Mengibar
CPC classification number: G10L15/063, G10L2015/0635
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a set of training utterances. The methods, systems, and apparatus include actions of obtaining a target multi-dimensional distribution of characteristics in an initial set of candidate utterances and selecting a subset of the initial set of candidate utterances based on speech recognition confidence scores associated with the candidate utterances. Additional actions include selecting a particular candidate utterance from the subset of the initial set of utterances and determining that adding the particular candidate utterance to a set of training utterances reduces a divergence of a multi-dimensional distribution of the characteristics in the set of training utterances from the target multi-dimensional distribution. Further actions include adding the particular candidate utterance to the set of training utterances.
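One way to read the selection loop in this abstract is as a greedy filter: keep confident candidates, then add a candidate only if it moves the training set's distribution closer to the target. The sketch below assumes KL divergence as the divergence measure, a two-dimensional characteristic (speaker group, noise condition), a small smoothing constant, and unconditional acceptance of the first candidate; none of these choices is stated in the abstract, and all names are hypothetical.

```python
import math
from collections import Counter


def kl_divergence(target, counts, smoothing=1e-6):
    """KL(target || empirical), where the empirical distribution comes from smoothed counts."""
    total = sum(counts.values()) + smoothing * len(target)
    return sum(
        p * math.log(p / ((counts.get(key, 0) + smoothing) / total))
        for key, p in target.items()
        if p > 0
    )


def select_training_utterances(candidates, target, min_confidence=0.7):
    """Greedily keep confident candidates that reduce divergence from `target`."""
    confident = [c for c in candidates if c["confidence"] >= min_confidence]
    counts = Counter()
    selected = []
    for candidate in confident:
        trial = counts.copy()
        trial[candidate["traits"]] += 1
        # Assumption: the first candidate is always accepted to seed the set.
        if not selected or kl_divergence(target, trial) < kl_divergence(target, counts):
            counts = trial
            selected.append(candidate)
    return selected


# Target distribution over (speaker group, noise condition); values are made up.
TARGET = {("female", "quiet"): 0.5, ("male", "quiet"): 0.5}
CANDIDATES = [
    {"id": "u1", "traits": ("female", "quiet"), "confidence": 0.90},
    {"id": "u2", "traits": ("female", "quiet"), "confidence": 0.95},
    {"id": "u3", "traits": ("male", "quiet"), "confidence": 0.50},
    {"id": "u4", "traits": ("male", "quiet"), "confidence": 0.80},
    {"id": "u5", "traits": ("female", "quiet"), "confidence": 0.90},
]
print([c["id"] for c in select_training_utterances(CANDIDATES, TARGET)])  # ['u1', 'u4']
```

Here `u3` is dropped by the confidence filter, while `u2` and `u5` are skipped because adding them would skew the set away from the even split in `TARGET`.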
-
Publication number: US09378731B2
Publication date: 2016-06-28
Application number: US14693268
Application date: 2015-04-22
Applicant: Google Inc.
Inventor: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
IPC: G10L15/00, G10L15/06, G10L25/30, G10L15/187, G10L15/26
CPC classification number: G10L15/063, G10L15/01, G10L15/16, G10L15/187, G10L15/30, G10L15/32, G10L2015/0633
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.
-
Publication number: US09373329B2
Publication date: 2016-06-21
Application number: US14064755
Application date: 2013-10-28
Applicant: Google Inc.
Inventor: Brian Strope, Francoise Beaufays, Olivier Siohan
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
-
Publication number: US20140058728A1
Publication date: 2014-02-27
Application number: US14064755
Application date: 2013-10-28
Applicant: Google Inc.
Inventor: Brian Strope, Francoise Beaufays, Olivier Siohan
IPC: G10L15/26
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
-
Publication number: US09472187B2
Publication date: 2016-10-18
Application number: US15164263
Application date: 2016-05-25
Applicant: Google Inc.
Inventor: Olga Kapralova, John Paul Alex, Eugene Weinstein, Pedro J. Moreno Mengibar, Olivier Siohan, Ignacio Lopez Moreno
CPC classification number: G10L15/063, G10L15/01, G10L15/16, G10L15/187, G10L15/30, G10L15/32, G10L2015/0633
Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.