ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS

    Publication No.: US20210125601A1

    Publication Date: 2021-04-29

    Application No.: US17143140

    Filing Date: 2021-01-06

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
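The asynchronous pattern in this abstract (each model replica obtains its own batch and the current parameters, then determines optimized parameters from both) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `ParameterStore` class, the `replica` and `train` functions, the scalar least-squares objective standing in for a sequence-training criterion, and the learning rate. It is not the patented method.

```python
import threading


class ParameterStore:
    """Hypothetical shared store holding one scalar weight (stand-in for network parameters)."""

    def __init__(self, w=0.0, lr=0.05):
        self.w = w
        self.lr = lr
        self._lock = threading.Lock()

    def fetch(self):
        # A replica obtains the current parameters.
        with self._lock:
            return self.w

    def apply(self, grad):
        # A replica pushes its update without waiting for the other replica.
        with self._lock:
            self.w -= self.lr * grad


def replica(store, batches):
    """One model replica: fetch parameters, compute a gradient on its own batch, push it."""
    for batch in batches:
        w = store.fetch()  # may already be stale when the gradient is applied
        # Gradient of the mean squared error of y ~ w * x (toy objective, assumed).
        grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
        store.apply(grad)


def train(steps=200):
    store = ParameterStore()
    # Two replicas see different batches drawn from the same relation y = 3x.
    first_batches = [[(1.0, 3.0), (2.0, 6.0)]] * steps
    second_batches = [[(0.5, 1.5), (1.5, 4.5)]] * steps
    threads = [
        threading.Thread(target=replica, args=(store, batches))
        for batches in (first_batches, second_batches)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.w
```

Calling `train()` returns a weight close to 3.0. The point of the sketch is only that neither replica blocks on the other between fetching parameters and applying its update.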

    Latency constraints for acoustic modeling

    Publication No.: US10733979B2

    Publication Date: 2020-08-04

    Application No.: US14879225

    Filing Date: 2015-10-09

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for acoustic modeling of audio data. One method includes receiving audio data representing a portion of an utterance, providing the audio data to a trained recurrent neural network that has been trained to indicate the occurrence of a phone at any of multiple time frames within a maximum delay of receiving audio data corresponding to the phone, receiving, within the predetermined maximum delay of providing the audio data to the trained recurrent neural network, output of the trained neural network indicating a phone corresponding to the provided audio data, using output of the trained neural network to determine a transcription for the utterance, and providing the transcription for the utterance.

    ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS

    Publication No.: US20200118549A1

    Publication Date: 2020-04-16

    Application No.: US16573323

    Filing Date: 2019-09-17

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.

    Asynchronous optimization for sequence training of neural networks

    Publication No.: US10482873B2

    Publication Date: 2019-11-19

    Application No.: US15910720

    Filing Date: 2018-03-02

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.

    Speech recognition using neural networks

    Publication No.: US10438581B2

    Publication Date: 2019-10-08

    Application No.: US13955483

    Filing Date: 2013-07-31

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using neural networks. A feature vector that models audio characteristics of a portion of an utterance is received. Data indicative of latent variables of multivariate factor analysis is received. The feature vector and the data indicative of the latent variables are provided as input to a neural network. A candidate transcription for the utterance is determined based on at least an output of the neural network.
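As a rough illustration of feeding both inputs to one network, the sketch below concatenates a per-frame feature vector with a vector of latent variables and passes the result through a single softmax layer. The concatenation, the single layer, the function names (`network_input`, `forward`), and all numeric values are assumptions chosen for the example; the abstract does not specify how the two inputs are combined.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def network_input(feature_vector, latent_variables):
    """Combine the acoustic features and the latent variables (assumed: concatenation)."""
    return list(feature_vector) + list(latent_variables)


def forward(inputs, weights, biases):
    """One dense layer followed by softmax; a stand-in for the neural network."""
    scores = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical values: 3 acoustic features, 2 latent variables, 2 output classes.
features = [0.2, -0.1, 0.4]
latents = [1.0, 0.5]
weights = [
    [0.5, -0.2, 0.1, 0.3, -0.4],
    [-0.3, 0.4, 0.2, -0.1, 0.6],
]
biases = [0.0, 0.1]
posteriors = forward(network_input(features, latents), weights, biases)
```

In a recognizer, posteriors like these would be one input to the search that produces a candidate transcription; that step is outside this sketch.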

    GENERATING REPRESENTATIONS OF ACOUSTIC SEQUENCES

    Publication No.: US20190139536A1

    Publication Date: 2019-05-09

    Application No.: US16179801

    Filing Date: 2018-11-02

    Applicant: Google LLC

    CPC classification number: G10L15/16 G10L15/02 G10L15/142 G10L2015/025

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating representations of acoustic sequences. One of the methods includes: receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; processing the acoustic feature representation at an initial time step using an acoustic modeling neural network; for each subsequent time step of the plurality of time steps: receiving an output generated by the acoustic modeling neural network for a preceding time step, generating a modified input from the output generated by the acoustic modeling neural network for the preceding time step and the acoustic representation for the time step, and processing the modified input using the acoustic modeling neural network to generate an output for the time step; and generating a phoneme representation for the utterance from the outputs for each of the time steps.
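The per-time-step loop described in this abstract can be sketched as follows. The function name `run_with_output_feedback`, the use of concatenation to build the modified input, and the `toy_model` stand-in (which simply averages its input) are all assumptions for illustration; a real acoustic modeling network would replace `toy_model`.

```python
def toy_model(inputs):
    """Stand-in for the acoustic modeling network: returns the mean of its inputs."""
    return [sum(inputs) / len(inputs)]


def run_with_output_feedback(acoustic_sequence, model):
    """Feed each step's output back into the next step's input.

    acoustic_sequence: list of per-time-step feature lists.
    model: callable mapping an input list to an output list.
    """
    # Initial time step: the network sees only the acoustic features.
    outputs = [model(acoustic_sequence[0])]
    for features in acoustic_sequence[1:]:
        # Modified input: previous output combined with the current features
        # (assumed here to be a simple concatenation).
        modified_input = outputs[-1] + list(features)
        outputs.append(model(modified_input))
    return outputs


sequence = [[1.0, 3.0], [2.0, 4.0], [0.0, 6.0]]
outputs = run_with_output_feedback(sequence, toy_model)
```

The returned list holds one output per time step; deriving a phoneme representation from those outputs is left out of the sketch.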

    QUANTUM ERROR CORRECTION USING NEURAL NETWORKS

    Publication No.: US20250068954A1

    Publication Date: 2025-02-27

    Application No.: US18237323

    Filing Date: 2023-08-23

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting errors in a computation performed by a quantum computer. In one aspect, a method comprises obtaining error correction data for each of a plurality of time steps during the computation; initializing a decoder state; and for each of a plurality of updating time steps, wherein each updating time step corresponds to one or more of the time steps: generating an intermediate representation; and processing a time step input through a Transformer neural network to update the decoder state for the updating time step. The method comprises generating a prediction of whether an error occurred in the computation from the decoder state for the last updating time step of the plurality of updating time steps.

    QUANTUM ERROR CORRECTION USING NEURAL NETWORKS

    Publication No.: US20250068953A1

    Publication Date: 2025-02-27

    Application No.: US18237204

    Filing Date: 2023-08-23

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting errors in a computation performed by a quantum computer. In one aspect, a method comprises obtaining error correction data for each of a plurality of time steps during the computation; initializing a decoder state; and for each of the plurality of time steps: generating an intermediate representation; and processing a time step input through a Transformer neural network to update the decoder state for the time step. The method comprises generating a prediction of whether an error occurred in the computation from the decoder state for the last time step of the plurality of time steps.

    Training acoustic models using connectionist temporal classification

    Publication No.: US11769493B2

    Publication Date: 2023-09-26

    Application No.: US17661794

    Filing Date: 2022-05-03

    Applicant: Google LLC

    CPC classification number: G10L15/16 G10L15/187 G10L15/30 G10L2015/022

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models and using the trained acoustic models. A connectionist temporal classification (CTC) acoustic model is accessed, the CTC acoustic model having been trained using a context-dependent state inventory generated from approximate phonetic alignments determined by another CTC acoustic model trained without fixed alignment targets. Audio data for a portion of an utterance is received. Input data corresponding to the received audio data is provided to the accessed CTC acoustic model. Data indicating a transcription for the utterance is generated based on output that the accessed CTC acoustic model produced in response to the input data. The data indicating the transcription is provided as output of an automated speech recognition service.
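For readers unfamiliar with how CTC output becomes a label sequence, the sketch below shows greedy (best-path) CTC decoding: take the most probable label per frame, collapse consecutive repeats, and drop the blank symbol. Greedy decoding is a common simplification chosen here for illustration, not necessarily what the patent uses; the function name, label set, and posterior values are hypothetical.

```python
def ctc_greedy_decode(frame_posteriors, labels, blank=0):
    """Best-path CTC decoding.

    frame_posteriors: list of per-frame probability lists over the labels.
    labels: list of label names; labels[blank] is the CTC blank symbol.
    """
    # Most probable label index at each frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__)
        for frame in frame_posteriors
    ]
    decoded = []
    previous = None
    for index in best_path:
        # Emit a label only when it differs from the previous frame and is not blank.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return decoded


labels = ["<blank>", "k", "ae", "t"]
frames = [
    [0.1, 0.7, 0.1, 0.1],  # k
    [0.2, 0.6, 0.1, 0.1],  # k (repeat, collapsed)
    [0.8, 0.1, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.7, 0.1],  # ae
    [0.1, 0.1, 0.6, 0.2],  # ae (repeat, collapsed)
    [0.9, 0.05, 0.03, 0.02],  # blank
    [0.1, 0.1, 0.1, 0.7],  # t
]
phones = ctc_greedy_decode(frames, labels)
```

Because a blank resets the repeat check, the same label on both sides of a blank frame is emitted twice, which is how CTC represents genuinely doubled labels.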
