-
Publication Number: US10019985B2
Publication Date: 2018-07-10
Application Number: US14258139
Filing Date: 2014-04-22
Applicant: Google Inc.
Inventor: Georg Heigold , Erik McDermott , Vincent O. Vanhoucke , Andrew W. Senior , Michiel A. U. Bacchiani
IPC: G10L15/06 , G10L15/16 , G10L15/183
CPC classification number: G10L15/063 , G06N3/0454 , G10L15/16 , G10L15/183
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
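As a rough illustration of the parallel arrangement this abstract describes, the sketch below shows two independent sequence-training workers, each obtaining its own batch of speech-feature frames plus a copy of the current parameters and returning its own locally optimized parameters. It is a minimal sketch only, not the patented method: the names (sequence_loss_grad, optimize) and the toy quadratic loss standing in for a real sequence-level criterion are hypothetical.

```python
# Minimal sketch of two sequence-training workers, each optimizing its own
# copy of the parameters on its own batch of frames. Toy loss for illustration.
import numpy as np

def sequence_loss_grad(params, frames):
    # Stand-in for the gradient of a sequence-level training criterion;
    # a simple quadratic so the sketch runs end to end.
    return 2.0 * (params - frames.mean(axis=0))

def optimize(params, frames, lr=0.01, steps=10):
    # One worker: refine a private copy of the parameters on its own batch.
    p = params.copy()
    for _ in range(steps):
        p -= lr * sequence_loss_grad(p, frames)
    return p

rng = np.random.default_rng(0)
shared_params = rng.normal(size=40)      # initial neural network parameters
batch_1 = rng.normal(size=(32, 40))      # frames of the first training utterances
batch_2 = rng.normal(size=(32, 40))      # frames of the second training utterances

# Each model obtains a batch and the current parameters, then produces its
# own optimized parameters independently of the other.
optimized_1 = optimize(shared_params, batch_1)
optimized_2 = optimize(shared_params, batch_2)
```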
-
Publication Number: US09886949B2
Publication Date: 2018-02-06
Application Number: US15392122
Filing Date: 2016-12-28
Applicant: Google Inc.
Inventor: Bo Li , Ron J. Weiss , Michiel A. U. Bacchiani , Tara N. Sainath , Kevin William Wilson
IPC: G10L15/00 , G10L15/16 , G10L21/0224 , G10L21/0216 , G10L15/26
CPC classification number: G10L15/16 , G10L15/20 , G10L15/26 , G10L21/0224 , G10L2021/02166
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
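The following is a minimal PyTorch sketch of the adaptive beamforming idea in this abstract: a small network looks at both input channels, predicts one set of filter taps per channel, the two filtered channels are summed into a single combined channel, and that channel is passed to an acoustic-model network. All class and layer names (AdaptiveBeamformer, predictor, acoustic), shapes, and the filter-prediction architecture are illustrative assumptions, not taken from the patent.

```python
# Sketch: predict per-channel filter taps from two channels, filter-and-sum
# into one channel, and score it with a stand-in acoustic model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveBeamformer(nn.Module):
    def __init__(self, taps=25, hidden=64, num_outputs=42):
        super().__init__()
        self.taps = taps
        # Filter-prediction stage: consumes both channels, emits 2 * taps values.
        self.predictor = nn.Sequential(
            nn.Linear(2 * 400, hidden), nn.ReLU(), nn.Linear(hidden, 2 * taps)
        )
        # Stand-in acoustic model operating on the single combined channel.
        self.acoustic = nn.Sequential(
            nn.Linear(400 - taps + 1, hidden), nn.ReLU(), nn.Linear(hidden, num_outputs)
        )

    def forward(self, ch1, ch2):             # each: (batch, 400) waveform samples
        both = torch.cat([ch1, ch2], dim=-1)
        h = self.predictor(both).view(-1, 2, 1, self.taps)
        # Filter each channel with its own predicted taps, one filter per example.
        y1 = torch.stack([F.conv1d(c.view(1, 1, -1), f[0:1]).squeeze()
                          for c, f in zip(ch1, h)])
        y2 = torch.stack([F.conv1d(c.view(1, 1, -1), f[1:2]).squeeze()
                          for c, f in zip(ch2, h)])
        combined = y1 + y2                    # single combined channel of audio data
        return self.acoustic(combined)        # scores used to derive a transcription

model = AdaptiveBeamformer()
scores = model(torch.randn(3, 400), torch.randn(3, 400))
```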
-
Publication Number: US09697826B2
Publication Date: 2017-07-04
Application Number: US15205321
Filing Date: 2016-07-08
Applicant: Google Inc.
Inventor: Tara N. Sainath , Ron J. Weiss , Kevin William Wilson , Andrew W. Senior , Arun Narayanan , Yedid Hoshen , Michiel A. U. Bacchiani
IPC: G10L15/16 , G10L15/06 , G10L21/0216 , G10L15/02
CPC classification number: G10L15/16 , G06N3/0445 , G06N3/0454 , G10L15/02 , G10L15/063 , G10L2021/02166 , H04R3/005
Abstract: Methods, including computer programs encoded on a computer storage medium, for enhancing the processing of audio waveforms for speech recognition using various neural network processing techniques. In one aspect, a method includes: receiving multiple channels of audio data corresponding to an utterance; convolving each of multiple filters, in a time domain, with each of the multiple channels of audio waveform data to generate convolution outputs, wherein the multiple filters have parameters that have been learned during a training process that jointly trains the multiple filters and trains a deep neural network as an acoustic model; combining, for each of the multiple filters, the convolution outputs for the filter for the multiple channels of audio waveform data; inputting the combined convolution outputs to the deep neural network trained jointly with the multiple filters; and providing a transcription for the utterance that is determined based on output of the deep neural network.
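A minimal sketch of the front end this abstract describes, under assumed sizes: a bank of time-domain filters is convolved with every input channel, the per-channel outputs of each filter are summed, and the result feeds a deep network acoustic model; because everything sits in one module, the filters and the deep network are trained jointly. The class and layer names (MultichannelWaveformFrontEnd, filterbank, acoustic) and the crude energy pooling are illustrative, not the patented system.

```python
# Sketch: time-domain filterbank over each channel, per-filter summation
# across channels, then a stand-in deep neural network acoustic model.
import torch
import torch.nn as nn

class MultichannelWaveformFrontEnd(nn.Module):
    def __init__(self, filters=128, taps=400, num_states=42):
        super().__init__()
        # One learned time-domain filterbank applied to every channel.
        self.filterbank = nn.Conv1d(1, filters, kernel_size=taps)
        self.acoustic = nn.Sequential(        # stand-in deep neural network
            nn.Linear(filters, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_states),
        )

    def forward(self, waveforms):             # (batch, channels, samples)
        batch, channels, samples = waveforms.shape
        # Convolve each filter with each channel in the time domain.
        conv = self.filterbank(waveforms.reshape(batch * channels, 1, samples))
        conv = conv.reshape(batch, channels, conv.shape[1], -1)
        # Combine, for each filter, its convolution outputs across channels.
        combined = conv.sum(dim=1)             # (batch, filters, frames)
        pooled = combined.abs().mean(dim=-1)   # crude energy pooling per filter
        return self.acoustic(pooled)           # acoustic-model state scores

model = MultichannelWaveformFrontEnd()
scores = model(torch.randn(4, 2, 800))         # 4 utterances, 2 mics, 800 samples
```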
-
Publication Number: US09620145B2
Publication Date: 2017-04-11
Application Number: US14282655
Filing Date: 2014-05-20
Applicant: Google Inc.
Inventor: Michiel A. U. Bacchiani , David Rybach
Abstract: The technology described herein can be embodied in a method that includes receiving an audio signal encoding a portion of an utterance, and providing, to a first neural network, data corresponding to the audio signal. The method also includes generating, by a processor, data representing a transcription for the utterance based on an output of the first neural network. The first neural network is trained using features of multiple context-dependent states, the context-dependent states being derived from a plurality of context-independent states provided by a second neural network.
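The sketch below illustrates, under loose assumptions, the two-network setup in this abstract: a second network scores context-independent (CI) states, context-dependent (CD) targets are derived from those CI states (the derivation rule here is a trivial placeholder), and the first network is then trained on the CD targets and used to produce scores for transcription. All sizes, names, and the CI-to-CD expansion are hypothetical.

```python
# Sketch: derive context-dependent training targets from a CI-state network,
# then train a second, larger network on those targets.
import torch
import torch.nn as nn

FEATS, CI_STATES, CONTEXTS = 40, 10, 3
CD_STATES = CI_STATES * CONTEXTS              # placeholder CI -> CD expansion

second_net = nn.Sequential(nn.Linear(FEATS, 64), nn.ReLU(),
                           nn.Linear(64, CI_STATES))    # context-independent scorer
first_net = nn.Sequential(nn.Linear(FEATS, 128), nn.ReLU(),
                          nn.Linear(128, CD_STATES))    # context-dependent scorer

frames = torch.randn(256, FEATS)              # feature frames from an audio signal
contexts = torch.randint(0, CONTEXTS, (256,)) # placeholder phonetic contexts

# Derive CD targets from the CI states the second network assigns to each frame.
ci_states = second_net(frames).argmax(dim=-1)
cd_targets = ci_states * CONTEXTS + contexts

# Train the first network on the derived context-dependent states.
opt = torch.optim.SGD(first_net.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(first_net(frames), cd_targets)
    loss.backward()
    opt.step()

# At recognition time, the first network's outputs drive the transcription.
cd_scores = first_net(frames)
```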
-