Speech recognition using associative mapping
    1.
    Granted patent (in force)

    Publication No.: US09299347B1

    Publication date: 2016-03-29

    Application No.: US14685790

    Filing date: 2015-04-14

    Applicant: Google Inc.

    Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
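
The abstract above does not say how the association data is stored or searched. As a minimal sketch, assume it is a list of (corrupted, clean) feature-vector pairs built ahead of time, and that selection means picking the clean vector whose corrupted counterpart is nearest to each incoming frame. The names `ASSOCIATIONS`, `nearest_clean`, and `select_clean_segments` are hypothetical.

```python
import math

# Hypothetical association data, computed before any utterance arrives:
# each entry pairs a corrupted feature vector with its clean original.
ASSOCIATIONS = [
    ((0.9, 0.2), (1.0, 0.0)),
    ((0.1, 0.8), (0.0, 1.0)),
]


def nearest_clean(frame, associations):
    """Return the clean vector whose corrupted version is closest to `frame`."""
    corrupted, clean = min(
        associations, key=lambda pair: math.dist(frame, pair[0])
    )
    return clean


def select_clean_segments(utterance_frames, associations=ASSOCIATIONS):
    """Map each received (possibly noisy) frame to associated clean data."""
    return [nearest_clean(frame, associations) for frame in utterance_frames]
```

A recognizer would then transcribe from the selected clean data rather than from the noisy input. Nearest-neighbour lookup is only one possible reading of "using the association data"; the abstract leaves the mechanism open.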

    ACOUSTIC MODEL TRAINING CORPUS SELECTION

    Publication No.: US20160267903A1

    Publication date: 2016-09-15

    Application No.: US15164263

    Filing date: 2016-05-25

    Applicant: Google Inc.

    Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.
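
The selection step described above can be sketched roughly as follows. The abstract does not state the selection criterion, so this sketch assumes the offline recognizer reports a confidence score and that items are kept when the score clears a threshold. `Transcribed`, `select_training_corpus`, and `min_confidence` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Transcribed:
    audio_id: str
    text: str
    confidence: float  # assumed to come from the offline recognizer


def select_training_corpus(
    audio_ids: Iterable[str],
    offline_recognize: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.9,
) -> List[Transcribed]:
    """Re-transcribe logged utterances offline and keep a selected subset."""
    selected = []
    for audio_id in audio_ids:
        text, confidence = offline_recognize(audio_id)
        # Assumed criterion: keep only transcriptions the offline system trusts.
        if confidence >= min_confidence:
            selected.append(Transcribed(audio_id, text, confidence))
    return selected
```

The returned subset would serve as training data for the updated production recognizer components; the training itself is outside this sketch.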

    ACOUSTIC MODEL TRAINING CORPUS SELECTION
    3.
    Patent application (in force)

    Publication No.: US20160093294A1

    Publication date: 2016-03-31

    Application No.: US14693268

    Filing date: 2015-04-22

    Applicant: Google Inc.

    Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.

    Speech recognition with parallel recognition tasks
    4.
    Granted patent (in force)

    Publication No.: US08571860B2

    Publication date: 2013-10-29

    Application No.: US13750807

    Filing date: 2013-01-25

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/00 G10L15/01 G10L15/26

    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.
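
The control flow in this abstract (start several recognizers, stop waiting once one result is confident enough, abort the rest) can be sketched with `asyncio`. This is an illustration under assumptions, not the patented implementation: each recognizer is assumed to be an async callable returning a `(text, confidence)` pair, at least one recognizer is assumed to be supplied, and the final result is assumed to be the highest-confidence result seen so far.

```python
import asyncio


async def recognize_parallel(audio, recognizers, threshold):
    """Run all recognizers concurrently; abort the rest once one is confident."""
    tasks = [asyncio.create_task(recognize(audio)) for recognize in recognizers]
    results = []
    try:
        for next_done in asyncio.as_completed(tasks):
            text, confidence = await next_done
            results.append((text, confidence))
            if confidence >= threshold:
                break  # confident enough: stop waiting for slower systems
    finally:
        # Abort recognizers that have not produced a result yet.
        # Cancelling an already finished task is a no-op.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
    # Assumed policy: output the best result generated so far.
    return max(results, key=lambda result: result[1])
```

If no result reaches the threshold, the loop simply runs until every recognizer has finished and the best available result is returned.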

    Speech Recognition with Parallel Recognition Tasks
    5.
    Patent application (under examination, published)

    Publication No.: US20160275951A1

    Publication date: 2016-09-22

    Application No.: US15171374

    Filing date: 2016-06-02

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/00 G10L15/01 G10L15/26

    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.

    Utterance selection for automated speech recognizer training
    6.
    Granted patent (in force)

    Publication No.: US09263033B2

    Publication date: 2016-02-16

    Application No.: US14314295

    Filing date: 2014-06-25

    Applicant: Google Inc.

    CPC classification number: G10L15/063 G10L2015/0635

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a set of training utterances. The methods, systems, and apparatus include actions of obtaining a target multi-dimensional distribution of characteristics in an initial set of candidate utterances and selecting a subset of the initial set of candidate utterances based on speech recognition confidence scores associated with the candidate utterances. Additional actions include selecting a particular candidate utterance from the subset of the initial set of utterances and determining that adding the particular candidate utterance to a set of training utterances reduces a divergence of a multi-dimensional distribution of the characteristics in the set of training utterances from the target multi-dimensional distribution. Further actions include adding the particular candidate utterance to the set of training utterances.
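
The greedy selection described above can be illustrated as follows. The abstract does not name the divergence measure, so this sketch assumes a smoothed Kullback-Leibler divergence over joint characteristic tuples; the candidate dictionary layout, `min_confidence`, and the function names are likewise hypothetical.

```python
import math
from collections import Counter


def kl_divergence(target, counts, eps=1e-9):
    """Smoothed KL(target || empirical distribution implied by `counts`)."""
    total = sum(counts.values())
    divergence = 0.0
    for traits, p in target.items():
        q = counts.get(traits, 0) / total if total else 0.0
        divergence += p * math.log(p / (q + eps))
    return divergence


def select_training_set(candidates, target, min_confidence=0.8):
    """Add a candidate only if it moves the set closer to the target."""
    pool = [c for c in candidates if c["confidence"] >= min_confidence]
    counts, training = Counter(), []
    for candidate in pool:
        trial = counts.copy()
        trial[candidate["traits"]] += 1
        if kl_divergence(target, trial) < kl_divergence(target, counts):
            counts = trial
            training.append(candidate)
    return training
```

Here `traits` is a tuple such as `("male", "quiet")`, so the counter approximates a multi-dimensional (joint) distribution. A candidate that would skew the set further from the target is skipped.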

    Acoustic model training corpus selection
    7.
    Granted patent (in force)

    Publication No.: US09378731B2

    Publication date: 2016-06-28

    Application No.: US14693268

    Filing date: 2015-04-22

    Applicant: Google Inc.

    Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.

    Speech recognition with parallel recognition tasks

    Publication No.: US09373329B2

    Publication date: 2016-06-21

    Application No.: US14064755

    Filing date: 2013-10-28

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/00 G10L15/01 G10L15/26

    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.

    Speech Recognition with Parallel Recognition Tasks
    9.
    Patent application (in force)

    Publication No.: US20140058728A1

    Publication date: 2014-02-27

    Application No.: US14064755

    Filing date: 2013-10-28

    Applicant: Google Inc.

    CPC classification number: G10L15/32 G10L15/00 G10L15/01 G10L15/26

    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal and initiating speech recognition tasks by a plurality of speech recognition systems (SRS's). Each SRS is configured to generate a recognition result specifying possible speech included in the audio signal and a confidence value indicating a confidence in a correctness of the speech result. The method also includes completing a portion of the speech recognition tasks including generating one or more recognition results and one or more confidence values for the one or more recognition results, determining whether the one or more confidence values meets a confidence threshold, aborting a remaining portion of the speech recognition tasks for SRS's that have not generated a recognition result, and outputting a final recognition result based on at least one of the generated one or more speech results.

    Acoustic model training corpus selection

    Publication No.: US09472187B2

    Publication date: 2016-10-18

    Application No.: US15164263

    Filing date: 2016-05-25

    Applicant: Google Inc.

    Abstract: The present disclosure relates to training a speech recognition system. One example method includes receiving a collection of speech data items, wherein each speech data item corresponds to an utterance that was previously submitted for transcription by a production speech recognizer. The production speech recognizer uses initial production speech recognizer components in generating transcriptions of speech data items. A transcription for each speech data item is generated using an offline speech recognizer, and the offline speech recognizer components are configured to improve speech recognition accuracy in comparison with the initial production speech recognizer components. The updated production speech recognizer components are trained for the production speech recognizer using a selected subset of the transcriptions of the speech data items generated by the offline speech recognizer. An updated production speech recognizer component is provided to the production speech recognizer for use in transcribing subsequently received speech data items.
