Patent search ap:("Google LLC") AND inv:"Joseph Caroselli" Page 1

1.

Granted invention patent
Adaptive multichannel dereverberation for automatic speech recognition (Status: in force)

Publication number: US11699453B2

Publication date: 2023-07-11

Application number: US17005823

Application date: 2020-08-28

Applicant: Google LLC

Inventor: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose

IPC: G10L21/00, G10L21/0208, G10L15/20, G10L15/22, G10L15/065, G06F3/16, G06N3/02, G06F17/14, G10L15/06, G10L21/0216

CPC classification number: G10L21/0208, G06F3/167, G06F17/142, G06N3/02, G10L15/063, G10L15/065, G10L15/20, G10L15/22, G10L2015/223, G10L2021/02082, G10L2021/02166

Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).

2.

Published invention application
Microphone Array Configuration Invariant, Streaming, Multichannel Neural Enhancement Frontend for Automatic Speech Recognition (Status: under examination, published)

Publication number: US20230298612A1

Publication date: 2023-09-21

Application number: US18171411

Application date: 2023-02-20

Applicant: Google LLC

Inventor: Joseph Caroselli, Arun Narayanan, Tom O'malley

IPC: G10L21/0232, G10L25/30, H04S3/00, G10L15/22, G10L15/06, G10L15/16, G10L25/18

CPC classification number: G10L21/0232, G10L25/30, H04S3/008, G10L15/22, G10L15/063, G10L15/16, G10L25/18, H04S2400/01, G10L2021/02082

Abstract: A multichannel neural frontend speech enhancement model for speech recognition includes a speech cleaner, a stack of self-attention blocks each having a multi-headed self attention mechanism, and a masking layer. The speech cleaner receives, as input, a multichannel noisy input signal and a multichannel contextual noise signal, and generates, as output, a single channel cleaned input signal. The stack of self-attention blocks receives, as input, at an initial block of the stack of self-attention blocks, a stacked input including the single channel cleaned input signal and a single channel noisy input signal, and generates, as output, from a final block of the stack of self-attention blocks, an un-masked output. The masking layer receives, as input, the single channel noisy input signal and the un-masked output, and generates, as output, enhanced input speech features corresponding to a target utterance.

3.

Granted invention patent
Adaptive multichannel dereverberation for automatic speech recognition (Status: in force)

Publication number: US10762914B2

Publication date: 2020-09-01

Application number: US16032996

Application date: 2018-07-11

Applicant: Google LLC

Inventor: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose

IPC: G10L21/00, G10L21/0208, G10L15/20, G10L15/22, G10L15/065, G06F3/16, G06N3/02, G06F17/14, G10L15/06, G10L21/0216

Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).

4.

Invention application
ADAPTIVE MULTICHANNEL DEREVERBERATION FOR AUTOMATIC SPEECH RECOGNITION (Status: under examination, published)

Publication number: US20190272840A1

Publication date: 2019-09-05

Application number: US16032996

Application date: 2018-07-11

Applicant: Google LLC

Inventor: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose

IPC: G10L21/0208, G10L15/20, G10L15/22, G10L15/065, G10L15/06, G06F3/16, G06N3/02, G06F17/14

Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).
