Patent search ap:("Google LLC") AND inv:"Yiteng Huang" Page 1

1.

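Invention application
CONTEXT AWARE BEAMFORMING OF AUDIO DATA (In force)

Publication (announcement) number: US20220319498A1

Publication (announcement) date: 2022-10-06

Application number: US17221220

Application date: 2021-04-02

Applicant: Google LLC

Inventor: Joseph Caroselli, Jr. , Yiteng Huang , Arun Narayanan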

IPC: G10L15/08 , G10L21/0216 , G10L15/05 , G06N20/00

Abstract: Implementations disclosed herein are directed to initializing and utilizing a beamformer in processing of audio data received at a computing device. The computing device can: receive audio data that captures a spoken utterance of a user, determine that a first audio data segment of the audio data includes one or more particular words or phrases; obtain a preceding audio data segment that precedes the first audio data segment; estimate a spatial correlation matrix based on the first audio data segment and based on the preceding audio data segment; initialize the beamformer based on the estimated spatial correlation matrix; and cause the initialized beamformer to be utilized in processing of at least a second audio data segment of the audio data. Additionally, or alternatively, the computing device can transmit the spatial correlation matrix to server(s), and the server(s) can transmit the initialized beamformer back to the computing device.
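The estimation and initialization steps named in this abstract can be illustrated with a minimal sketch. It works on a single frequency bin and makes several assumptions that are not stated in the abstract: the preceding segment is treated as noise-only, its correlation matrix is subtracted from that of the hotword segment, and the principal eigenvector of the difference is used as the beamformer weights. All function names are hypothetical.

```python
def spatial_correlation(frames):
    """Average the outer product x x^H over frames (one frequency bin).

    frames: list of per-frame channel vectors (real or complex samples).
    """
    n = len(frames[0])
    R = [[0j] * n for _ in range(n)]
    for x in frames:
        for i in range(n):
            for j in range(n):
                R[i][j] += x[i] * complex(x[j]).conjugate()
    return [[v / len(frames) for v in row] for row in R]


def init_beamformer(first_segment, preceding_segment, iters=50):
    """Return unit-norm weights from the two segments.

    Assumption: the preceding segment is noise-only, so subtracting its
    correlation matrix leaves an estimate dominated by the talker.
    """
    R_first = spatial_correlation(first_segment)
    R_prev = spatial_correlation(preceding_segment)
    n = len(R_first)
    R = [[R_first[i][j] - R_prev[i][j] for j in range(n)] for i in range(n)]
    # Power iteration for the principal eigenvector; assumes a dominant
    # eigenvalue and a start vector that is not orthogonal to it.
    w = [1 + 0j] * n
    for _ in range(iters):
        w = [sum(R[i][j] * w[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(v) ** 2 for v in w) ** 0.5
        w = [v / norm for v in w]
    return w


def apply_beamformer(w, x):
    """Filter one later frame x with the initialized weights: y = w^H x."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))
```

In this sketch, frames arriving from the talker's direction pass through `apply_beamformer` with gain, while a frame whose channel vector is orthogonal to the estimated direction is suppressed.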

2.

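Invention publication
Small Footprint Multi-Channel Keyword Spotting (Under examination, published)

Publication (announcement) number: US20240347051A1

Publication (announcement) date: 2024-10-17

Application number: US18754462

Application date: 2024-06-26

Applicant: Google LLC

Inventor: Jilong Wu , Yiteng Huang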

IPC: G10L15/16 , G10L15/08 , G10L15/28 , H04R3/00

CPC classification number: G10L15/16 , G10L15/285 , H04R3/005 , G10L2015/088

Abstract: A method to detect a hotword in a spoken utterance includes receiving a sequence of input frames characterizing streaming multi-channel audio. Each channel of the streaming multi-channel audio includes respective audio features captured by a separate dedicated microphone. For each input frame, the method includes processing, using a three-dimensional (3D) single value decomposition filter (SVDF) input layer of a memorized neural network, the respective audio features of each channel in parallel and generating a corresponding multi-channel audio feature representation based on a concatenation of the respective audio features. The method also includes generating, using sequentially-stacked SVDF layers, a probability score indicating a presence of a hotword in the audio. The method also includes determining whether the probability score satisfies a threshold and, when satisfied, initiating a wake-up process on a user device.
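The per-channel processing and concatenation described in this abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the patented network: each SVDF node is modeled as a feature filter followed by a time filter over a short memory of past projections, with a ReLU output; the weights are hand-picked rather than learned; and the class and function names are hypothetical.

```python
from collections import deque


class SVDFNode:
    """One rank-1 SVDF node: feature filter, then time filter over memory."""

    def __init__(self, feature_filter, time_filter):
        self.alpha = feature_filter  # one weight per input feature
        self.beta = time_filter      # one weight per memory slot, oldest first
        self.memory = deque([0.0] * len(time_filter), maxlen=len(time_filter))

    def step(self, features):
        # Project the current frame to a scalar and push it into memory.
        projection = sum(a * f for a, f in zip(self.alpha, features))
        self.memory.append(projection)
        # Filter across time, then apply a ReLU.
        activation = sum(b * m for b, m in zip(self.beta, self.memory))
        return max(0.0, activation)


def multichannel_input_layer(nodes_per_channel, frame):
    """Run each channel through its own nodes and concatenate the outputs.

    nodes_per_channel: one list of SVDFNode objects per microphone channel.
    frame: one feature vector per channel for the current input frame.
    """
    outputs = []
    for nodes, features in zip(nodes_per_channel, frame):
        outputs.extend(node.step(features) for node in nodes)
    return outputs
```

The concatenated list stands in for the multi-channel audio feature representation; the stacked SVDF layers, probability score, and threshold check from the abstract are left out of this sketch.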

3.

Invention grant
Context aware beamforming of audio data (In force)

Publication (announcement) number: US11798533B2

Publication (announcement) date: 2023-10-24

Application number: US17221220

Application date: 2021-04-02

Applicant: Google LLC

Inventor: Joseph Caroselli, Jr. , Yiteng Huang , Arun Narayanan

IPC: G10L15/08 , G10L21/0216 , G06N20/00 , G10L15/05

CPC classification number: G10L15/083 , G06N20/00 , G10L15/05 , G10L21/0216 , G10L2015/088 , G10L2021/02166

Abstract: Implementations disclosed herein are directed to initializing and utilizing a beamformer in processing of audio data received at a computing device. The computing device can: receive audio data that captures a spoken utterance of a user, determine that a first audio data segment of the audio data includes one or more particular words or phrases; obtain a preceding audio data segment that precedes the first audio data segment; estimate a spatial correlation matrix based on the first audio data segment and based on the preceding audio data segment; initialize the beamformer based on the estimated spatial correlation matrix; and cause the initialized beamformer to be utilized in processing of at least a second audio data segment of the audio data. Additionally, or alternatively, the computing device can transmit the spatial correlation matrix to server(s), and the server(s) can transmit the initialized beamformer back to the computing device.

4.

Invention grant
Selective adaptation and utilization of noise reduction technique in invocation phrase detection (In force)

Publication (announcement) number: US11984117B2

Publication (announcement) date: 2024-05-14

Application number: US17886726

Application date: 2022-08-12

Applicant: Google LLC

Inventor: Christopher Hughes , Yiteng Huang , Turaj Zakizadeh Shabestary , Taylor Applebaum

IPC: G10L15/20 , G10L15/02 , G10L15/08 , G10L15/22 , G10L21/0232 , G10L25/84 , G10L21/0216

CPC classification number: G10L15/20 , G10L15/02 , G10L15/08 , G10L15/22 , G10L21/0232 , G10L25/84 , G10L2015/025 , G10L2015/088 , G10L2015/223 , G10L2021/02166

Abstract: Techniques are described for selectively adapting and/or selectively utilizing a noise reduction technique in detection of one or more features of a stream of audio data frames. For example, various techniques are directed to selectively adapting and/or utilizing a noise reduction technique in detection of an invocation phrase in a stream of audio data frames, detection of voice characteristics in a stream of audio data frames (e.g., for speaker identification), etc. Utilization of described techniques can result in more robust and/or more accurate detections of features of a stream of audio data frames in various situations, such as in environments with strong background noise. In various implementations, described techniques are implemented in combination with an automated assistant, and feature(s) detected utilizing techniques described herein are utilized to adapt the functionality of the automated assistant.

5.

Invention grant
Selective adaptation and utilization of noise reduction technique in invocation phrase detection (In force)

Publication (announcement) number: US11417324B2

Publication (announcement) date: 2022-08-16

Application number: US16886139

Application date: 2020-05-28

Applicant: Google LLC

Inventor: Christopher Hughes , Yiteng Huang , Turaj Zakizadeh Shabestary , Taylor Applebaum

IPC: G10L15/20 , G10L15/02 , G10L15/08 , G10L15/22 , G10L21/0232 , G10L25/84 , G10L21/0216

Abstract: Techniques are described for selectively adapting and/or selectively utilizing a noise reduction technique in detection of one or more features of a stream of audio data frames. For example, various techniques are directed to selectively adapting and/or utilizing a noise reduction technique in detection of an invocation phrase in a stream of audio data frames, detection of voice characteristics in a stream of audio data frames (e.g., for speaker identification), etc. Utilization of described techniques can result in more robust and/or more accurate detections of features of a stream of audio data frames in various situations, such as in environments with strong background noise. In various implementations, described techniques are implemented in combination with an automated assistant, and feature(s) detected utilizing techniques described herein are utilized to adapt the functionality of the automated assistant.

6.

Invention publication
SELECTIVE ADAPTATION AND UTILIZATION OF NOISE REDUCTION TECHNIQUE IN INVOCATION PHRASE DETECTION (Under examination, published)

Publication (announcement) number: US20240304187A1

Publication (announcement) date: 2024-09-12

Application number: US18662334

Application date: 2024-05-13

Applicant: GOOGLE LLC

Inventor: Christopher Hughes , Yiteng Huang , Turaj Zakizadeh Shabestary , Taylor Applebaum

IPC: G10L15/20 , G10L15/02 , G10L15/08 , G10L15/22 , G10L21/0216 , G10L21/0232 , G10L25/84

CPC classification number: G10L15/20 , G10L15/02 , G10L15/08 , G10L15/22 , G10L21/0232 , G10L25/84 , G10L2015/025 , G10L2015/088 , G10L2015/223 , G10L2021/02166

Abstract: Techniques are described for selectively adapting and/or selectively utilizing a noise reduction technique in detection of one or more features of a stream of audio data frames. For example, various techniques are directed to selectively adapting and/or utilizing a noise reduction technique in detection of an invocation phrase in a stream of audio data frames, detection of voice characteristics in a stream of audio data frames (e.g., for speaker identification), etc. Utilization of described techniques can result in more robust and/or more accurate detections of features of a stream of audio data frames in various situations, such as in environments with strong background noise. In various implementations, described techniques are implemented in combination with an automated assistant, and feature(s) detected utilizing techniques described herein are utilized to adapt the functionality of the automated assistant.

7.

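Invention grant
Small footprint multi-channel keyword spotting (In force)

Publication (announcement) number: US12051406B2

Publication (announcement) date: 2024-07-30

Application number: US17757260

Application date: 2020-01-15

Applicant: Google LLC

Inventor: Jilong Wu , Yiteng Huang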

IPC: G10L15/16 , G10L15/08 , G10L15/28 , H04R3/00

CPC classification number: G10L15/16 , G10L15/285 , H04R3/005 , G10L2015/088

Abstract: A method (800) to detect a hotword in a spoken utterance (120) includes receiving a sequence of input frames (210) characterizing streaming multi-channel audio (118). Each channel (119) of the streaming multi-channel audio includes respective audio features (510) captured by a separate dedicated microphone (107). For each input frame, the method includes processing, using a three-dimensional (3D) single value decomposition filter (SVDF) input layer (302) of a memorized neural network (300), the respective audio features of each channel in parallel and generating a corresponding multi-channel audio feature representation (420) based on a concatenation of the respective audio features (344). The method also includes generating, using sequentially-stacked SVDF layers (350), a probability score (360) indicating a presence of a hotword in the audio. The method also includes determining whether the probability score satisfies a threshold and, when satisfied, initiating a wake-up process on a user device (102).

8.

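Invention publication
Cascade Architecture for Noise-Robust Keyword Spotting (Under examination, published)

Publication (announcement) number: US20240242728A1

Publication (announcement) date: 2024-07-18

Application number: US18619608

Application date: 2024-03-28

Applicant: Google LLC

Inventor: Yiteng Huang , Alexander H. Gruenstein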

IPC: G10L21/0216 , G10L15/08 , G10L15/22

CPC classification number: G10L21/0216 , G10L15/08 , G10L15/22 , G10L2015/088 , G10L2021/02166

Abstract: A method includes receiving, at a first processor of a user device, streaming multi-channel audio captured by an array of microphones, each channel including respective audio features. For each channel, the method also includes processing, by the first processor, using a first stage hotword detector, the respective audio features to determine whether a hotword is detected. When the first stage hotword detector detects the hotword, the method also includes the first processor providing chomped raw audio data to a second processor that processes, using a first noise cleaning algorithm, the chomped raw audio data to generate a clean monophonic audio chomp. The method also includes processing, by the second processor using a second stage hotword detector, the clean monophonic audio chomp to detect the hotword.
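The two-stage control flow in this abstract can be sketched as below. The detectors and the noise-cleaning step are placeholders supplied by the caller, and several points are assumptions rather than details from the abstract: a first-stage hit on any one channel is taken as a detection, the "chomp" is simply the full buffer passed in, and channel averaging stands in for the noise cleaning algorithm. All names are hypothetical.

```python
def mean_energy(samples):
    """Average squared amplitude, used here as a toy detection score."""
    return sum(s * s for s in samples) / len(samples)


def average_channels(channels):
    """Hypothetical noise-cleaning stand-in: average channels sample by sample."""
    return [sum(column) / len(column) for column in zip(*channels)]


def cascade_detect(channels, first_stage, clean, second_stage):
    """Return True only if both stages report the hotword.

    channels: list of per-microphone sample lists of equal length.
    first_stage, second_stage: callables returning True on detection.
    clean: callable mapping multi-channel audio to one mono signal.
    """
    # First processor: run the cheap detector on every channel.
    if not any(first_stage(channel) for channel in channels):
        return False  # the second processor is never invoked
    # Second processor: clean down to mono, then confirm the detection.
    mono = clean(channels)
    return second_stage(mono)
```

The point of the arrangement is that the more expensive cleaning and second-stage detection run only after the first stage has fired, for example `cascade_detect(buffers, lambda ch: mean_energy(ch) > 0.5, average_channels, lambda m: mean_energy(m) > 0.2)` with thresholds chosen purely for illustration.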

9.

Invention grant
Selective adaptation and utilization of noise reduction technique in invocation phrase detection (In force)

Publication (announcement) number: US10706842B2

Publication (announcement) date: 2020-07-07

Application number: US16609619

Application date: 2019-01-14

Applicant: Google LLC

Inventor: Christopher Hughes , Yiteng Huang , Turaj Zakizadeh Shabestary , Taylor Applebaum

IPC: G10L15/20 , G10L15/02 , G10L15/08 , G10L15/22 , G10L21/0232 , G10L25/84 , G10L21/0216

Abstract: Techniques are described for selectively adapting and/or selectively utilizing a noise reduction technique in detection of one or more features of a stream of audio data frames. Various techniques are directed to selectively adapting and/or utilizing a noise reduction technique in detection of an invocation phrase in a stream of audio data frames, detection of voice characteristics in a stream of audio data frames (e.g., for speaker identification), etc. Utilization of described techniques can result in more robust and/or more accurate detections of features of a stream of audio data frames in various situations, such as in environments with strong background noise. In various implementations, described techniques are implemented in combination with an automated assistant, and feature(s) detected utilizing techniques described herein are utilized to adapt the functionality of the automated assistant.

10.

Invention application
SELECTIVE ADAPTATION AND UTILIZATION OF NOISE REDUCTION TECHNIQUE IN INVOCATION PHRASE DETECTION (Under examination, published)

Publication (announcement) number: US20200066263A1

Publication (announcement) date: 2020-02-27

Application number: US16609619

Application date: 2019-01-14

Applicant: Google LLC

Inventor: Christopher Hughes , Yiteng Huang , Turaj Zakizadeh Shabestary , Taylor Applebaum

IPC: G10L15/20 , G10L15/02 , G10L25/84 , G10L21/0232 , G10L15/22 , G10L15/08

Abstract: Techniques are described for selectively adapting and/or selectively utilizing a noise reduction technique in detection of one or more features of a stream of audio data frames. For example, various techniques are directed to selectively adapting and/or utilizing a noise reduction technique in detection of an invocation phrase in a stream of audio data frames, detection of voice characteristics in a stream of audio data frames (e.g., for speaker identification), etc. Utilization of described techniques can result in more robust and/or more accurate detections of features of a stream of audio data frames in various situations, such as in environments with strong background noise. In various implementations, described techniques are implemented in combination with an automated assistant, and feature(s) detected utilizing techniques described herein are utilized to adapt the functionality of the automated assistant.
