Multichannel raw-waveform neural networks

Invention Grant

US10339921B2 Multichannel raw-waveform neural networks (In force)

Patent Title: Multichannel raw-waveform neural networks
Application No.: US14987146

Application Date: 2016-01-04
Publication No.: US10339921B2

Publication Date: 2019-07-02
Inventor: Tara N. Sainath, Ron J. Weiss, Kevin William Wilson
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: Fish & Richardson P.C.
Main IPC: G10L15/00
IPC: G10L15/00; G10L15/16; G10L15/34; G06N3/04; G06N3/08; G10L15/20; G10L21/0208

Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using neural networks. One of the methods includes receiving, by a neural network in a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal, the first raw audio signal and the second raw audio signal being for the same period of time, generating, by a spatial filtering convolutional layer in the neural network, a spatial filtered output using the first data and the second data, generating, by a spectral filtering convolutional layer in the neural network, a spectral filtered output using the spatial filtered output, and processing, by one or more additional layers in the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.
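
The data flow in the abstract (two raw channels, then a spatial filtering convolution, then a spectral filtering convolution, then further layers that score sub-word units) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation. The filter lengths, the filter-and-sum form of the spatial layer, the max pooling with log compression, the single softmax output layer, the random weights, and every function and variable name below are assumptions made for illustration only.

```python
import math
import random

def conv1d_valid(signal, taps):
    """Slide the filter taps over the signal (valid mode, no padding)."""
    n = len(signal) - len(taps) + 1
    return [sum(signal[t + k] * taps[k] for k in range(len(taps)))
            for t in range(n)]

def spatial_layer(channels, filters):
    """Assumed filter-and-sum: filters[p][c] holds the taps applied to
    channel c for output p; per-channel results are summed sample by sample."""
    outputs = []
    for per_channel_taps in filters:
        filtered = [conv1d_valid(x, h)
                    for x, h in zip(channels, per_channel_taps)]
        outputs.append([sum(vals) for vals in zip(*filtered)])
    return outputs

def spectral_layer(spatial_out, filters):
    """Convolve each spatial output with each longer filter, then
    max-pool over time, rectify, and log-compress (assumed choices)."""
    feats = []
    for y in spatial_out:
        for g in filters:
            z = conv1d_valid(y, g)
            feats.append(math.log(max(max(z), 0.0) + 0.01))
    return feats

def output_layer(feats, weights):
    """Stand-in for the 'additional layers': one linear layer and a
    softmax over hypothetical sub-word units."""
    logits = [sum(w * f for w, f in zip(row, feats)) for row in weights]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)

def rand_taps(n):
    """Random values standing in for learned weights or audio samples."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

# Hypothetical sizes chosen only to keep the example small.
NUM_SAMPLES, SPATIAL_LEN, SPECTRAL_LEN = 64, 4, 16
NUM_SPATIAL, NUM_SPECTRAL, NUM_UNITS = 3, 5, 4

# Two raw audio signals covering the same period of time.
first_signal = rand_taps(NUM_SAMPLES)
second_signal = rand_taps(NUM_SAMPLES)

spatial_filters = [[rand_taps(SPATIAL_LEN) for _ in range(2)]
                   for _ in range(NUM_SPATIAL)]
spectral_filters = [rand_taps(SPECTRAL_LEN) for _ in range(NUM_SPECTRAL)]
out_weights = [rand_taps(NUM_SPATIAL * NUM_SPECTRAL)
               for _ in range(NUM_UNITS)]

spatial_out = spatial_layer([first_signal, second_signal], spatial_filters)
features = spectral_layer(spatial_out, spectral_filters)
posteriors = output_layer(features, out_weights)
```

Here `posteriors` holds one score per hypothetical sub-word unit for the analysed window. In a trained system the filter taps and output weights would be learned jointly rather than drawn at random.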

Public/Granted literature

US20170092265A1 MULTICHANNEL RAW-WAVEFORM NEURAL NETWORKS Public/Granted day: 2017-03-30

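IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)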