Patent search ap:("GOOGLE LLC") AND inv:"Michiel A.U. Bacchiani" Page 2

11.

Invention application

ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS

Status: Under examination (published)

Publication number: US20200258500A1

Publication date: 2020-08-13

Application number: US16863432

Application date: 2020-04-30

Applicant: Google LLC

Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A.U. Bacchiani

IPC: G10L15/06, G10L15/16, G10L15/183, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.

12.

Invention application

Hotword Suppression

Status: Under examination (published)

Publication number: US20190362719A1

Publication date: 2019-11-28

Application number: US16418415

Application date: 2019-05-21

Applicant: Google LLC

Inventor: Alexander H. Gruenstein, Taral Pradeep Joglekar, Vijayaditya Peddinti, Michiel A.U. Bacchiani

IPC: G10L15/22, G10L15/08, G10L15/06, G10L25/51, G10L15/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotwords are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to playback of an utterance. The actions further include providing the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample. The actions further include receiving, from the model, data indicating whether the audio data includes the audio watermark. The actions further include, based on the data indicating whether the audio data includes the audio watermark, determining to continue or cease processing of the audio data.

13.

Invention application

ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS

Status: Under examination (published)

Publication number: US20180261204A1

Publication date: 2018-09-13

Application number: US15910720

Application date: 2018-03-02

Applicant: Google LLC

Inventor: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A.U. Bacchiani

IPC: G10L15/06, G10L15/183, G10L15/16

CPC classification number: G10L15/063, G06N3/0454, G10L15/16, G10L15/183

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.

14.

Invention application

COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING

Status: Under examination (published)

Publication number: US20180174575A1

Publication date: 2018-06-21

Application number: US15386979

Application date: 2016-12-21

Applicant: Google LLC

Inventor: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A.U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

IPC: G10L15/16, G10L19/02, G10L15/02

CPC classification number: G10L15/16, G10H1/00, G10H2210/036, G10H2210/046, G10H2250/235, G10H2250/311, G10L15/02, G10L17/18

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.

15.

Invention application

ADAPTIVE AUDIO ENHANCEMENT FOR MULTICHANNEL SPEECH RECOGNITION

Status: In force

Publication number: US20220148582A1

Publication date: 2022-05-12

Application number: US17649058

Application date: 2022-01-26

Applicant: Google LLC

Inventor: Bo Li, Ron Weiss, Michiel A.U. Bacchiani, Tara N. Sainath, Kevin William Wilson

IPC: G10L15/16, G10L15/20, G10L21/0224

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.

16.

Invention application

ASYNCHRONOUS OPTIMIZATION FOR SEQUENCE TRAINING OF NEURAL NETWORKS

Status: In force

Publication number: US20220108686A1

Publication date: 2022-04-07

Application number: US17644362

Application date: 2021-12-15

Applicant: Google LLC

Inventor: Georg Heigold, Erik McDermott, Vincent O. VanHoucke, Andrew W. Senior, Michiel A.U. Bacchiani

IPC: G10L15/06, G10L15/16, G10L15/183, G06N3/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.

17.

Invention application

QUERY ENDPOINTING BASED ON LIP DETECTION

Status: Under examination (published)

Publication number: US20190333507A1

Publication date: 2019-10-31

Application number: US16412677

Application date: 2019-05-15

Applicant: Google LLC

Inventor: Chanwoo Kim, Rajeev Conrad Nongpiur, Michiel A.U. Bacchiani

IPC: G10L15/22, G10L15/25, G10L25/78, G06K9/00, G10L15/04, G10L15/26

Abstract: Systems and methods are described for improving endpoint detection of a voice query submitted by a user. In some implementations, a synchronized video data and audio data is received. A sequence of frames of the video data that includes images corresponding to lip movement on a face is determined. The audio data is endpointed based on first audio data that corresponds to a first frame of the sequence of frames and second audio data that corresponds to a last frame of the sequence of frames. A transcription of the endpointed audio data is generated by an automated speech recognizer. The generated transcription is then provided for output.

18.

Invention application

COMPLEX LINEAR PROJECTION FOR ACOUSTIC MODELING

Status: Under examination (published)

Publication number: US20190115013A1

Publication date: 2019-04-18

Application number: US16171629

Application date: 2018-10-26

Applicant: Google LLC

Inventor: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Michiel A.U. Bacchiani, Tara N. Sainath, Ehsan Variani, Izhak Shafran

IPC: G10L15/16, G10L19/02, G10L15/02, G10H1/00

CPC classification number: G10L15/16, G10H1/00, G10H2210/036, G10H2210/046, G10H2250/235, G10H2250/311, G10L15/02, G10L17/18, G10L19/0212

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.
