Patent search ap:("Google LLC") AND inv:"Jilong Wu" Page 1

1.

Granted invention patent
Small footprint multi-channel keyword spotting (Active)

Publication No.: US12051406B2

Publication date: 2024-07-30

Application No.: US17757260

Filing date: 2020-01-15

Applicant: Google LLC

Inventor: Jilong Wu, Yiteng Huang

IPC: G10L15/16, G10L15/08, G10L15/28, H04R3/00

CPC classification number: G10L15/16, G10L15/285, H04R3/005, G10L2015/088

Abstract: A method (800) to detect a hotword in a spoken utterance (120) includes receiving a sequence of input frames (210) characterizing streaming multi-channel audio (118). Each channel (119) of the streaming multi-channel audio includes respective audio features (510) captured by a separate dedicated microphone (107). For each input frame, the method includes processing, using a three-dimensional (3D) single value decomposition filter (SVDF) input layer (302) of a memorized neural network (300), the respective audio features of each channel in parallel and generating a corresponding multi-channel audio feature representation (420) based on a concatenation of the respective audio features (344). The method also includes generating, using sequentially-stacked SVDF layers (350), a probability score (360) indicating a presence of a hotword in the audio. The method also includes determining whether the probability score satisfies a threshold and, when satisfied, initiating a wake-up process on a user device (102).
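The data flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the class names, layer sizes, random (untrained) weights, the sigmoid output, and the `>=` threshold comparison are all assumptions made for the example. Each SVDF node is modeled in its usual rank-1 form, a feature filter followed by a time filter over a short memory of past frames.

```python
import math
import random
from collections import deque


class SVDFNode:
    """Hypothetical rank-1 SVDF node: feature filter, then time filter over a memory."""

    def __init__(self, feature_dim, memory_size, rng):
        self.alpha = [rng.uniform(-0.5, 0.5) for _ in range(feature_dim)]  # feature filter
        self.beta = [rng.uniform(-0.5, 0.5) for _ in range(memory_size)]   # time filter
        self.memory = deque([0.0] * memory_size, maxlen=memory_size)

    def step(self, features):
        # Project the current frame to a scalar and remember it (oldest value drops out).
        self.memory.append(sum(a * x for a, x in zip(self.alpha, features)))
        # Filter the remembered projections over time, then apply ReLU.
        return max(0.0, sum(b * m for b, m in zip(self.beta, self.memory)))


class SVDFLayer:
    """A group of SVDF nodes that all see the same input frame."""

    def __init__(self, num_nodes, feature_dim, memory_size, rng):
        self.nodes = [SVDFNode(feature_dim, memory_size, rng) for _ in range(num_nodes)]

    def step(self, features):
        return [node.step(features) for node in self.nodes]


class MultiChannelHotwordDetector:
    """Per-channel input layers -> concatenation -> stacked SVDF layers -> score."""

    def __init__(self, num_channels, feature_dim, nodes_per_layer=4,
                 memory_size=3, num_stacked=2, threshold=0.5, seed=0):
        rng = random.Random(seed)  # untrained weights, for illustration only
        # One input sub-layer per microphone channel (the "3D" input layer in this sketch).
        self.input_layers = [SVDFLayer(nodes_per_layer, feature_dim, memory_size, rng)
                             for _ in range(num_channels)]
        width = num_channels * nodes_per_layer
        self.stacked = []
        for _ in range(num_stacked):
            self.stacked.append(SVDFLayer(nodes_per_layer, width, memory_size, rng))
            width = nodes_per_layer
        self.out_weights = [rng.uniform(-0.5, 0.5) for _ in range(width)]
        self.threshold = threshold

    def score_frame(self, frame):
        """frame holds one feature vector per channel; returns a score in (0, 1)."""
        if len(frame) != len(self.input_layers):
            raise ValueError("expected one feature vector per channel")
        # Channels are independent here, so this loop could run in parallel.
        representation = []
        for layer, channel_features in zip(self.input_layers, frame):
            representation.extend(layer.step(channel_features))  # concatenation
        hidden = representation
        for layer in self.stacked:
            hidden = layer.step(hidden)
        logit = sum(w * h for w, h in zip(self.out_weights, hidden))
        return 1.0 / (1.0 + math.exp(-logit))  # assumed sigmoid output

    def detect(self, frames):
        """Return the index of the first frame whose score meets the threshold, else None."""
        for index, frame in enumerate(frames):
            if self.score_frame(frame) >= self.threshold:
                return index  # a real device would start its wake-up process here
        return None
```

In this sketch, `detect` stands in for the threshold check that gates the wake-up process. A trained model would replace the random weights, and the layer counts and memory sizes would come from the patent's detailed description rather than the defaults chosen here.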

2.

Invention publication
Small Footprint Multi-Channel Keyword Spotting (Under examination, published)

Publication No.: US20240347051A1

Publication date: 2024-10-17

Application No.: US18754462

Filing date: 2024-06-26

Applicant: Google LLC

Inventor: Jilong Wu, Yiteng Huang

IPC: G10L15/16, G10L15/08, G10L15/28, H04R3/00

CPC classification number: G10L15/16, G10L15/285, H04R3/005, G10L2015/088

Abstract: A method to detect a hotword in a spoken utterance includes receiving a sequence of input frames characterizing streaming multi-channel audio. Each channel of the streaming multi-channel audio includes respective audio features captured by a separate dedicated microphone. For each input frame, the method includes processing, using a three-dimensional (3D) single value decomposition filter (SVDF) input layer of a memorized neural network, the respective audio features of each channel in parallel and generating a corresponding multi-channel audio feature representation based on a concatenation of the respective audio features. The method also includes generating, using sequentially-stacked SVDF layers, a probability score indicating a presence of a hotword in the audio. The method also includes determining whether the probability score satisfies a threshold and, when satisfied, initiating a wake-up process on a user device.

3.

Invention application
Small Footprint Multi-Channel Keyword Spotting (Active)

Publication No.: US20230022800A1

Publication date: 2023-01-26

Application No.: US17757260

Filing date: 2020-01-15

Applicant: Google LLC

Inventor: Jilong Wu, Yiteng Huang

IPC: G10L15/16, H04R3/00, G10L15/28

Abstract: A method (800) to detect a hotword in a spoken utterance (120) includes receiving a sequence of input frames (210) characterizing streaming multi-channel audio (118). Each channel (119) of the streaming multi-channel audio includes respective audio features (510) captured by a separate dedicated microphone (107). For each input frame, the method includes processing, using a three-dimensional (3D) single value decomposition filter (SVDF) input layer (302) of a memorized neural network (300), the respective audio features of each channel in parallel and generating a corresponding multi-channel audio feature representation (420) based on a concatenation of the respective audio features (344). The method also includes generating, using sequentially-stacked SVDF layers (350), a probability score (360) indicating a presence of a hotword in the audio. The method also includes determining whether the probability score satisfies a threshold and, when satisfied, initiating a wake-up process on a user device (102).
