Patent search ap:("Google Inc.") AND inv:"Bo Li" Page 1

1.

Patent application
AUTOMATIC SPEECH RECOGNITION USING MULTI-DIMENSIONAL MODELS (in force)

Publication No.: US20180025721A1

Publication date: 2018-01-25

Application No.: US15217457

Filing date: 2016-07-22

Applicant: Google Inc.

Inventor: Bo Li , Tara N. Sainath

IPC: G10L15/16 , G10L15/26 , G10L15/02 , G06N3/08

CPC classification number: G10L15/16 , G06N3/08 , G10L15/02 , G10L15/26 , G10L2015/025

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automatic speech recognition using multi-dimensional models. In some implementations, audio data that describes an utterance is received. A transcription for the utterance is determined using an acoustic model that includes a neural network having first memory blocks for time information and second memory blocks for frequency information. The transcription for the utterance is provided as output of an automated speech recognizer.
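The abstract's idea of separate memory blocks for time and for frequency can be illustrated with a toy example. The following is a minimal sketch only, not the architecture claimed in the application: it assumes scalar LSTM cells, random untrained weights, and a simple arrangement in which a frequency cell scans the bins of each frame and its final state feeds a time cell that runs across frames. The names `ScalarLSTMCell` and `encode` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ScalarLSTMCell:
    """Toy LSTM memory block with scalar input, hidden state, and cell state."""

    def __init__(self, rng):
        # One (input weight, recurrent weight, bias) triple per gate:
        # input gate "i", forget gate "f", output gate "o", candidate "g".
        # Weights are random and untrained (assumption for illustration).
        self.params = {
            gate: [rng.uniform(-0.5, 0.5) for _ in range(3)] for gate in "ifog"
        }

    def step(self, x, h, c):
        def pre(gate):
            w_x, w_h, b = self.params[gate]
            return w_x * x + w_h * h + b

        i, f, o = sigmoid(pre("i")), sigmoid(pre("f")), sigmoid(pre("o"))
        g = math.tanh(pre("g"))
        c_new = f * c + i * g          # update the memory cell
        h_new = o * math.tanh(c_new)   # gated output
        return h_new, c_new


def encode(frames, freq_cell, time_cell):
    """Return one time-cell output per frame.

    frames: list of frames, each a list of per-frequency-bin features.
    """
    h_t, c_t = 0.0, 0.0
    outputs = []
    for frame in frames:
        # Frequency memory block: scan across the bins of this frame.
        h_f, c_f = 0.0, 0.0
        for bin_value in frame:
            h_f, c_f = freq_cell.step(bin_value, h_f, c_f)
        # Time memory block: carry state from one frame to the next.
        h_t, c_t = time_cell.step(h_f, h_t, c_t)
        outputs.append(h_t)
    return outputs


rng = random.Random(0)
frames = [[rng.random() for _ in range(8)] for _ in range(5)]  # 5 frames x 8 bins
outputs = encode(frames, ScalarLSTMCell(rng), ScalarLSTMCell(rng))
print(len(outputs))
```

In a real acoustic model the per-frame outputs would feed further layers that produce phonetic or word scores; that part is outside the scope of this sketch.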

2.

Granted patent
Adaptive audio enhancement for multichannel speech recognition (in force)

Publication No.: US09886949B2

Publication date: 2018-02-06

Application No.: US15392122

Filing date: 2016-12-28

Applicant: Google Inc.

Inventor: Bo Li , Ron J. Weiss , Michiel A. U. Bacchiani , Tara N. Sainath , Kevin William Wilson

IPC: G10L15/00 , G10L15/16 , G10L21/0224 , G10L21/0216 , G10L15/26

CPC classification number: G10L15/16 , G10L15/20 , G10L15/26 , G10L21/0224 , G10L2021/02166

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
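The combining step in this abstract (one filter per channel, outputs merged into a single channel) resembles classic filter-and-sum beamforming. The sketch below shows that step under stated assumptions: `predict_filters` is a hypothetical placeholder that weights each channel by its share of total energy, standing in for the filter-generation step, which the abstract describes only as depending on both channels. It is not the patented method, and the recognizer that consumes the combined channel is omitted.

```python
def fir_filter(signal, taps):
    """Causal FIR filter: y[t] = sum over k of taps[k] * signal[t - k]."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if t - k >= 0:
                acc += tap * signal[t - k]
        out.append(acc)
    return out


def predict_filters(channel_1, channel_2, num_taps=3):
    """Hypothetical stand-in for the filter-generation step.

    Both filters depend on both channels, as in the abstract, but the rule
    here (energy-share weighting on the first tap) is an assumption chosen
    only to keep the example small.
    """
    e1 = sum(x * x for x in channel_1)
    e2 = sum(x * x for x in channel_2)
    total = (e1 + e2) or 1.0  # avoid division by zero on silent input
    taps_1 = [e1 / total] + [0.0] * (num_taps - 1)
    taps_2 = [e2 / total] + [0.0] * (num_taps - 1)
    return taps_1, taps_2


def filter_and_sum(channel_1, channel_2):
    """Filter each channel with its own taps, then sum into one channel."""
    taps_1, taps_2 = predict_filters(channel_1, channel_2)
    y1 = fir_filter(channel_1, taps_1)
    y2 = fir_filter(channel_2, taps_2)
    return [a + b for a, b in zip(y1, y2)]


combined = filter_and_sum([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
print(combined)
```

The combined channel would then be passed to the speech recognition network to obtain a transcription.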

3.

Patent application
ADAPTIVE AUDIO ENHANCEMENT FOR MULTICHANNEL SPEECH RECOGNITION (in force)

Publication No.: US20170278513A1

Publication date: 2017-09-28

Application No.: US15392122

Filing date: 2016-12-28

Applicant: Google Inc.

Inventor: Bo Li , Ron J. Weiss , Michiel A.U. Bacchiani , Tara N. Sainath , Kevin William Wilson

IPC: G10L15/16 , G10L21/0224

CPC classification number: G10L15/16 , G10L15/20 , G10L15/26 , G10L21/0224 , G10L2021/02166

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
