Patent search ap:("Google LLC") AND inv:"Andrew W. Senior" Page 2

11.

Granted invention patent
Generating representations of acoustic sequences (in force)

Publication (announcement) number: US11721327B2

Publication (announcement) date: 2023-08-08

Application number: US17145208

Application date: 2021-01-08

Applicant: Google LLC

Inventor: Hasim Sak, Andrew W. Senior

IPC: G10L15/00, G10L15/16, G10L15/02, G10L15/14

CPC classification number: G10L15/16, G10L15/02, G10L15/142, G10L2015/025

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating representations of acoustic sequences. One of the methods includes: receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; processing the acoustic feature representation at an initial time step using an acoustic modeling neural network; for each subsequent time step of the plurality of time steps: receiving an output generated by the acoustic modeling neural network for a preceding time step, generating a modified input from the output generated by the acoustic modeling neural network for the preceding time step and the acoustic representation for the time step, and processing the modified input using the acoustic modeling neural network to generate an output for the time step; and generating a phoneme representation for the utterance from the outputs for each of the time steps.
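The feedback loop described in this abstract can be sketched roughly as follows. This is an illustration only, not the patented implementation: the one-layer softmax `acoustic_model`, the use of concatenation to build the modified input, the zero vector used at the initial time step, and all dimensions are assumptions made for the example.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def acoustic_model(x, weights, bias):
    """Toy stand-in for the acoustic modeling network: one softmax layer."""
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

def run_with_feedback(features, weights, bias, n_out):
    """Feed each step's output back into the next step's input."""
    outputs = []
    prev = [0.0] * n_out  # assumption: zeros stand in for "no previous output"
    for feat in features:
        modified = list(feat) + prev  # assumption: combine by concatenation
        prev = acoustic_model(modified, weights, bias)
        outputs.append(prev)
    return outputs

random.seed(0)
n_feat, n_out, n_steps = 4, 3, 5
weights = [[random.gauss(0, 1) for _ in range(n_feat + n_out)]
           for _ in range(n_out)]
bias = [0.0] * n_out
features = [[random.gauss(0, 1) for _ in range(n_feat)]
            for _ in range(n_steps)]

outputs = run_with_feedback(features, weights, bias, n_out)
# Highest-scoring label index per time step, a crude phoneme representation.
labels = [max(range(n_out), key=lambda k: step[k]) for step in outputs]
```

Each row of `outputs` is a distribution over the hypothetical label set, and `labels` keeps the most likely label per time step.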

12.

Granted invention patent
Convolutional, long short-term memory, fully connected deep neural networks (in force)

Publication (announcement) number: US11715486B2

Publication (announcement) date: 2023-08-01

Application number: US16731464

Application date: 2019-12-31

Applicant: Google LLC

Inventor: Tara N. Sainath, Andrew W. Senior, Oriol Vinyals, Hasim Sak

IPC: G06N3/044, G06N3/045, G10L25/30, G10L15/16, G10L15/02

CPC classification number: G10L25/30, G06N3/044, G06N3/045, G10L15/16, G10L15/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying the language of a spoken utterance. One of the methods includes receiving input features of an utterance; and processing the input features using an acoustic model that comprises one or more convolutional neural network (CNN) layers, one or more long short-term memory network (LSTM) layers, and one or more fully connected neural network layers to generate a transcription for the utterance.

13.

Invention patent application
TRAINING ACOUSTIC MODELS USING CONNECTIONIST TEMPORAL CLASSIFICATION (in force)

Publication (announcement) number: US20220262350A1

Publication (announcement) date: 2022-08-18

Application number: US17661794

Application date: 2022-05-03

Applicant: Google LLC

Inventor: Kanury Kanishka Rao, Andrew W. Senior, Hasim Sak

IPC: G10L15/16, G10L15/187

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models and using the trained acoustic models. A connectionist temporal classification (CTC) acoustic model is accessed, the CTC acoustic model having been trained using a context-dependent state inventory generated from approximate phonetic alignments determined by another CTC acoustic model trained without fixed alignment targets. Audio data for a portion of an utterance is received. Input data corresponding to the received audio data is provided to the accessed CTC acoustic model. Data indicating a transcription for the utterance is generated based on output that the accessed CTC acoustic model produced in response to the input data. The data indicating the transcription is provided as output of an automated speech recognition service.
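The abstract does not say how the CTC model's output is turned into a transcription. As background, one common approach is greedy (best-path) CTC decoding, sketched below. The blank index, the three-symbol label set, and the frame scores are assumptions made for the example, not details from the patent.

```python
BLANK = 0  # assumption: index 0 is the CTC blank symbol

def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Pick the best label per frame, merge repeats, then drop blanks."""
    best = [max(range(len(f)), key=lambda k: f[k]) for f in frame_scores]
    decoded = []
    prev = None
    for label in best:
        # A label is emitted only when it differs from the previous frame's
        # label and is not the blank symbol.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded

# Hypothetical per-frame scores over {blank, label 1, label 2}.
frame_scores = [
    [0.10, 0.80, 0.10],  # label 1
    [0.10, 0.70, 0.20],  # label 1 again (merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates the two occurrences of label 1)
    [0.20, 0.70, 0.10],  # label 1
    [0.10, 0.20, 0.70],  # label 2
]
result = ctc_greedy_decode(frame_scores)
```

The blank frame is what lets the same label appear twice in a row in the decoded sequence.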

14.

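Invention patent application
GENERATING REPRESENTATIONS OF ACOUSTIC SEQUENCES (in force)

Publication (announcement) number: US20210134275A1

Publication (announcement) date: 2021-05-06

Application number: US17145208

Application date: 2021-01-08

Applicant: Google LLC

Inventor: Hasim Sak, Andrew W. Senior

IPC: G10L15/16, G10L15/02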

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating representations of acoustic sequences. One of the methods includes: receiving an acoustic sequence, the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; processing the acoustic feature representation at an initial time step using an acoustic modeling neural network; for each subsequent time step of the plurality of time steps: receiving an output generated by the acoustic modeling neural network for a preceding time step, generating a modified input from the output generated by the acoustic modeling neural network for the preceding time step and the acoustic representation for the time step, and processing the modified input using the acoustic modeling neural network to generate an output for the time step; and generating a phoneme representation for the utterance from the outputs for each of the time steps.

15.

Granted invention patent
Speech recognition using neural networks (in force)

Publication (announcement) number: US10930271B2

Publication (announcement) date: 2021-02-23

Application number: US16573232

Application date: 2019-09-17

Applicant: Google LLC

Inventor: Andrew W. Senior, Ignacio Lopez Moreno

IPC: G10L15/16, G06N3/02, G10L15/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using neural networks. A feature vector that models audio characteristics of a portion of an utterance is received. Data indicative of latent variables of multivariate factor analysis is received. The feature vector and the data indicative of the latent variables are provided as input to a neural network. A candidate transcription for the utterance is determined based on at least an output of the neural network.
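One simple way to provide both a frame-level feature vector and utterance-level latent variables to a network is to concatenate them, as in the sketch below. The concatenation, the single dense layer, and all values and dimensions are assumptions for illustration; the abstract does not state how the two inputs are combined.

```python
def build_network_input(feature_vector, latent_variables):
    """Append utterance-level latent variables to a frame-level feature vector."""
    return list(feature_vector) + list(latent_variables)

def dense(x, weights, bias):
    """One fully connected layer without a nonlinearity."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

frame = [0.5, -1.0, 0.25]  # hypothetical 3-dim acoustic features for one frame
latent = [0.1, 0.9]        # hypothetical 2-dim latent-variable estimate

x = build_network_input(frame, latent)

# Hypothetical weights: each output unit mixes one acoustic feature
# with one latent variable.
weights = [
    [1.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0, 0.0],
]
bias = [0.0, 0.0]
scores = dense(x, weights, bias)
```

Because the latent variables are constant across an utterance, the same `latent` list would be appended to every frame of that utterance.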

16.

Invention patent application
ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS (under examination, published)

Publication (announcement) number: US20200258500A1

Publication (announcement) date: 2020-08-13

Application number: US16863432

Application date: 2020-04-30

Applicant: Google LLC

Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A.U. Bacchiani

IPC: G10L15/06, G10L15/16, G10L15/183, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
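The two-worker pattern in this abstract resembles asynchronous stochastic gradient descent against shared parameters. The sketch below simulates it sequentially on a toy one-parameter regression. The `ParameterServer` class, the squared-error loss (used here in place of a sequence-level training criterion), the learning rate, and the data are all assumptions for illustration.

```python
class ParameterServer:
    """Hypothetical shared store holding the current model parameter."""

    def __init__(self, w):
        self.w = w

    def fetch(self):
        return self.w

    def apply_update(self, grad, lr):
        self.w -= lr * grad

def gradient(w, batch):
    """Gradient of mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

server = ParameterServer(w=0.0)
batch_1 = [(1.0, 2.0), (2.0, 4.0)]  # first worker's training data
batch_2 = [(3.0, 6.0), (4.0, 8.0)]  # second worker's training data

for _ in range(50):
    # Both workers fetch before either pushes, so the second worker's
    # gradient is computed from parameters that are already stale.
    w_first = server.fetch()
    w_second = server.fetch()
    server.apply_update(gradient(w_first, batch_1), lr=0.01)
    server.apply_update(gradient(w_second, batch_2), lr=0.01)

final_w = server.w
```

In this toy setting the stale update does not prevent convergence: `final_w` approaches 2.0, the slope that fits both batches.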

17.

Granted invention patent
Asynchronous optimization for sequence training of neural networks (under examination, published)

Publication (announcement) number: US10672384B2

Publication (announcement) date: 2020-06-02

Application number: US16573323

Application date: 2019-09-17

Applicant: Google LLC

Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani

IPC: G10L15/06, G10L15/16, G10L15/183, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.

18.

Granted invention patent
Processing audio waveforms (in force)

Publication (announcement) number: US10403269B2

Publication (announcement) date: 2019-09-03

Application number: US15080927

Application date: 2016-03-25

Applicant: Google LLC

Inventor: Tara N. Sainath, Ron J. Weiss, Andrew W. Senior, Kevin William Wilson

IPC: G10L15/16, G06N3/04, G06N3/08, G10L15/26, G10L15/14

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing audio waveforms. In some implementations, a time-frequency feature representation is generated based on audio data. The time-frequency feature representation is input to an acoustic model comprising a trained artificial neural network. The trained artificial neural network comprises a frequency convolution layer, a memory layer, and one or more hidden layers. An output that is based on output of the trained artificial neural network is received. A transcription is provided, where the transcription is determined based on the output of the acoustic model.
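As a rough illustration of what a frequency convolution layer computes, the sketch below slides a single filter along the frequency axis of one time-frequency frame. The frame values and the filter are hypothetical, and the memory and hidden layers named in the abstract are not modeled here.

```python
def frequency_convolution(frame, kernel):
    """Valid 1-D convolution (cross-correlation form) along the frequency axis."""
    k = len(kernel)
    return [
        sum(kernel[j] * frame[i + j] for j in range(k))
        for i in range(len(frame) - k + 1)
    ]

# Hypothetical energies in 7 frequency bins for a single time step.
frame = [0.0, 1.0, 3.0, 2.0, 0.0, 1.0, 4.0]
# Hypothetical filter that responds to a drop in energy across 3 bins.
kernel = [1.0, 0.0, -1.0]

conv = frequency_convolution(frame, kernel)
```

The same filter weights are applied at every frequency offset, which is what lets the layer respond to a spectral pattern regardless of where it appears along the frequency axis.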

19.

Invention patent application
ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS (under examination, published)

Publication (announcement) number: US20180261204A1

Publication (announcement) date: 2018-09-13

Application number: US15910720

Application date: 2018-03-02

Applicant: Google LLC

Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A.U. Bacchiani

IPC: G10L15/06, G10L15/183, G10L15/16

CPC classification number: G10L15/063, G06N3/0454, G10L15/16, G10L15/183

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.

20.

Granted invention patent
Processing acoustic sequences using long short-term memory (LSTM) neural networks that include recurrent projection layers (in force)

Publication (announcement) number: US10026397B2

Publication (announcement) date: 2018-07-17

Application number: US15454407

Application date: 2017-03-09

Applicant: Google LLC

Inventor: Hasim Sak, Andrew W. Senior

IPC: G10L15/16, G10L15/02, G10L15/14

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating phoneme representations of acoustic sequences using projection sequences. One of the methods includes receiving an acoustic sequence, the acoustic sequence representing an utterance, and the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; for each of the plurality of time steps, processing the acoustic feature representation through each of one or more long short-term memory (LSTM) layers; and for each of the plurality of time steps, processing the recurrent projected output generated by the highest LSTM layer for the time step using an output layer to generate a set of scores for the time step.
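A single LSTM layer with a recurrent projection can be sketched as below. This is a simplified illustration, not the patented design: biases and peephole connections are omitted, only one layer is used, and the `LSTMPLayer` class, the weight initialization, and all dimensions are assumptions.

```python
import math
import random

def matvec(matrix, vector):
    """Matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class LSTMPLayer:
    """Toy LSTM layer with a recurrent projection (no biases or peepholes)."""

    def __init__(self, n_in, n_cell, n_proj, rng):
        # One weight matrix per gate, applied to [input ; previous projection].
        self.gates = {g: rand_matrix(n_cell, n_in + n_proj, rng) for g in "ifog"}
        self.proj = rand_matrix(n_proj, n_cell, rng)
        self.n_cell = n_cell
        self.n_proj = n_proj

    def run(self, sequence):
        c = [0.0] * self.n_cell  # cell state
        r = [0.0] * self.n_proj  # recurrent projected output
        outputs = []
        for x in sequence:
            z = list(x) + r
            i = [sigmoid(v) for v in matvec(self.gates["i"], z)]
            f = [sigmoid(v) for v in matvec(self.gates["f"], z)]
            o = [sigmoid(v) for v in matvec(self.gates["o"], z)]
            g = [math.tanh(v) for v in matvec(self.gates["g"], z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            m = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            # Project the cell output down; this smaller vector is what is
            # fed back at the next time step and passed to the output layer.
            r = matvec(self.proj, m)
            outputs.append(r)
        return outputs

rng = random.Random(0)
layer = LSTMPLayer(n_in=4, n_cell=8, n_proj=3, rng=rng)
output_weights = rand_matrix(5, 3, rng)  # hypothetical 5-way output layer
sequence = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]

projected = layer.run(sequence)
scores = [matvec(output_weights, r) for r in projected]  # one score set per step
```

Because the recurrent connections and the output layer both consume the 3-dimensional projection rather than the 8-dimensional cell output, the recurrent and output weight matrices are smaller than in a layer without projection.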
