Patent search ap:("Amazon Technologies, Inc.") AND inv:"Gengshen Fu"

1.

Invention publication
MULTIPLE WAKEWORD DETECTION (under examination, published)

Publication (announcement) number: US20230186902A1

Publication (announcement) date: 2023-06-15

Application number: US17547547

Application date: 2021-12-10

Applicant: Amazon Technologies, Inc.

Inventor: Gengshen Fu, Huitian Lei, Sai Kiran Venkata Subramanya Rupanagudi, Yuriy Mishchenko, Cody Jacques

IPC: G10L15/16, G10L15/22

CPC classification number: G10L15/16, G10L15/22, G10L2015/088

Abstract: A device is configured to detect multiple different wakewords. A device may operate a joint encoder that operates on audio data to determine encoded audio data. The device may operate multiple different decoders which process the encoded audio data to determine if a wakeword is detected. Each decoder may correspond to a different wakeword. The decoders may use fewer computing resources than the joint encoder, allowing for the device to more easily perform multiple wakeword processing. Enabling/disabling wakeword(s) may involve the reconfiguring of a wakeword detector to add/remove data for respective decoder(s).
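
The shared-encoder arrangement in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `MultiWakewordDetector`, the `toy_encoder` function, the lambda decoders, and the 0.5 threshold are all hypothetical stand-ins chosen for illustration. The point it shows is that the encoder runs once per audio input, each enabled wakeword adds only a lightweight decoder, and enabling or disabling a wakeword is a matter of adding or removing that decoder's data.

```python
from typing import Callable, Dict, List

Vector = List[float]


class MultiWakewordDetector:
    """Toy detector: one shared encoder, one lightweight decoder per wakeword."""

    def __init__(self, encoder: Callable[[Vector], Vector]) -> None:
        self.encoder = encoder
        self.decoders: Dict[str, Callable[[Vector], float]] = {}

    def enable(self, wakeword: str, decoder: Callable[[Vector], float]) -> None:
        # Enabling a wakeword only adds decoder data; the encoder is untouched.
        self.decoders[wakeword] = decoder

    def disable(self, wakeword: str) -> None:
        # Disabling a wakeword removes its decoder data.
        self.decoders.pop(wakeword, None)

    def detect(self, audio: Vector, threshold: float = 0.5) -> List[str]:
        encoded = self.encoder(audio)  # the expensive step, run once
        return [w for w, decode in self.decoders.items()
                if decode(encoded) >= threshold]


def toy_encoder(audio: Vector) -> Vector:
    # Hypothetical stand-in for a neural encoder: mean and peak amplitude.
    mean = sum(audio) / len(audio)
    peak = max(abs(x) for x in audio)
    return [mean, peak]


detector = MultiWakewordDetector(toy_encoder)
detector.enable("alpha", lambda e: e[1])        # scores high on loud input
detector.enable("beta", lambda e: 1.0 - e[1])   # scores high on quiet input
print(detector.detect([0.1, 0.9, -0.2]))
```

Because every decoder consumes the same encoded vector, adding a wakeword costs one extra decoder call rather than a second full model pass.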

2.

Granted invention
Wakeword detection using multi-word model (in force)

Publication (announcement) number: US11308939B1

Publication (announcement) date: 2022-04-19

Application number: US16140737

Application date: 2018-09-25

Applicant: Amazon Technologies, Inc.

Inventor: Yixin Gao, Ming Sun, Varun Nagaraja, Gengshen Fu, Chao Wang, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/14, G10L15/22, G06F3/16, G10L15/08

Abstract: A system and method performs wakeword detection and automatic speech recognition using the same acoustic model. A mapping engine maps phones/senones output by the acoustic model to phones/senones corresponding to the wakeword. A hidden Markov model (HMM) may determine that the wakeword is present in audio data; the HMM may have multiple paths for multiple wakewords or may have multiple models. Once the wakeword is detected, ASR is performed using the acoustic model.

3.

Granted invention
Dynamic wakeword detection (in force)

Publication (announcement) number: US10777189B1

Publication (announcement) date: 2020-09-15

Application number: US15832259

Application date: 2017-12-05

Applicant: Amazon Technologies, Inc.

Inventor: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu

IPC: G10L15/00, G10L15/18, G10L15/30, G10L15/22, G10L15/08

Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.
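
The dynamic-threshold behavior in this abstract can be sketched as follows. This is an illustrative approximation only: the class name `DynamicThresholdDetector`, the threshold values 0.8 and 0.6, and the 5-second window are assumptions, since the abstract gives no concrete numbers. A score must clear the first threshold normally, but clears a second, lower threshold if it arrives within the window after a previous detection.

```python
from typing import Optional


class DynamicThresholdDetector:
    """Toy detector that lowers its threshold shortly after a detection."""

    def __init__(self, base_threshold: float = 0.8,
                 lowered_threshold: float = 0.6,
                 window_seconds: float = 5.0) -> None:
        # All three defaults are hypothetical values for illustration.
        self.base_threshold = base_threshold
        self.lowered_threshold = lowered_threshold
        self.window_seconds = window_seconds
        self._last_detection: Optional[float] = None

    def current_threshold(self, now: float) -> float:
        # Use the lower threshold only inside the window after a detection.
        if (self._last_detection is not None
                and now - self._last_detection <= self.window_seconds):
            return self.lowered_threshold
        return self.base_threshold

    def process(self, score: float, now: float) -> bool:
        detected = score >= self.current_threshold(now)
        if detected:
            self._last_detection = now
        return detected


d = DynamicThresholdDetector()
results = [
    d.process(0.7, now=0.0),   # below the base threshold
    d.process(0.9, now=1.0),   # clears the base threshold
    d.process(0.7, now=3.0),   # inside the window, lower threshold applies
    d.process(0.7, now=20.0),  # window expired, base threshold applies again
]
print(results)
```

Timestamps are passed in explicitly rather than read from a clock so the behavior is deterministic and easy to check.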

4.

Granted invention
Preemptive wakeword detection (in force)

Publication (announcement) number: US12190875B1

Publication (announcement) date: 2025-01-07

Application number: US17490572

Application date: 2021-09-30

Applicant: Amazon Technologies, Inc.

Inventor: Eli Joshua Fidler, Aaron Challenner, Zoe Adams, Sree Hari Krishnan Parthasarathi, Gengshen Fu

IPC: G10L15/00, G10L15/02, G10L15/22, G10L15/08, G10L15/187

Abstract: Systems and methods for preemptive wakeword detection are disclosed. For example, a first part of a wakeword is detected from audio data representing a user utterance. When this occurs, on-device speech processing is initiated prior to when the entire wakeword is detected. When the entire wakeword is detected, results from the on-device speech processing and/or the audio data is sent to a speech processing system to determine a responsive action to be performed by the device. When the entire wakeword is not detected, on-device processing is canceled and the device refrains from sending the audio data to the speech processing system.
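
The control flow in this abstract can be sketched as a small state machine. Everything here is a hypothetical illustration: the names `PreemptiveSession` and `Outcome` and the use of a byte buffer to stand in for on-device speech processing are assumptions, not details from the patent. The sketch shows only the ordering: processing starts on a partial wakeword, and the buffered audio is either released or discarded once the full-wakeword decision arrives.

```python
from enum import Enum
from typing import List


class Outcome(Enum):
    PENDING = "pending"
    SENT = "sent"
    CANCELED = "canceled"


class PreemptiveSession:
    """Toy session: start work on a partial wakeword, commit or cancel later."""

    def __init__(self) -> None:
        self.buffer: List[bytes] = []
        self.processing_started = False
        self.outcome = Outcome.PENDING

    def on_partial_wakeword(self) -> None:
        # First part of the wakeword heard: begin on-device processing early.
        self.processing_started = True

    def on_audio(self, chunk: bytes) -> None:
        # Buffering stands in for real on-device speech processing.
        if self.processing_started and self.outcome is Outcome.PENDING:
            self.buffer.append(chunk)

    def on_wakeword_result(self, full_wakeword_detected: bool) -> List[bytes]:
        """Return the audio to send upstream, or an empty list if canceled."""
        if not self.processing_started:
            return []
        if full_wakeword_detected:
            self.outcome = Outcome.SENT
            return list(self.buffer)
        # Full wakeword never arrived: cancel and send nothing.
        self.outcome = Outcome.CANCELED
        self.buffer.clear()
        return []


session = PreemptiveSession()
session.on_partial_wakeword()
session.on_audio(b"frame-1")
session.on_audio(b"frame-2")
to_send = session.on_wakeword_result(full_wakeword_detected=True)
print(session.outcome, to_send)
```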

5.

Invention application
AUDIO DETECTION (in force)

Publication (announcement) number: US20240412728A1

Publication (announcement) date: 2024-12-12

Application number: US18333041

Application date: 2023-06-12

Applicant: Amazon Technologies, Inc.

Inventor: Michael Thomas Peterson, Gengshen Fu, Aaron Challenner, Rong Chen, Cody Jacques, Stefan M Bradstreet

IPC: G10L15/22, G10L15/16

Abstract: A device is configured to detect multiple different wakewords. A device may operate a joint encoder that operates on audio data to determine encoded audio data. The device may operate multiple different decoders which process the encoded audio data to determine if a wakeword is detected. Each decoder may correspond to a different wakeword. The decoders may use fewer computing resources than the joint encoder, allowing for the device to more easily perform multiple wakeword processing. Enabling/disabling wakeword(s) may involve the reconfiguring of a wakeword detector to add/remove data for respective decoder(s). Specific decoders may be activated/deactivated depending on device context, thereby efficiently managing device resources.

6.

Invention application
PREEMPTIVE WAKEWORD DETECTION (in force)

Publication (announcement) number: US20250149036A1

Publication (announcement) date: 2025-05-08

Application number: US18966827

Application date: 2024-12-03

Applicant: Amazon Technologies, Inc.

Inventor: Eli Joshua Fidler, Aaron Challenner, Zoe Adams, Sree Hari Krishnan Parthasarathi, Gengshen Fu

IPC: G10L15/22, G10L15/02, G10L15/08, G10L15/187

Abstract: Systems and methods for preemptive wakeword detection are disclosed. For example, a first part of a wakeword is detected from audio data representing a user utterance. When this occurs, on-device speech processing is initiated prior to when the entire wakeword is detected. When the entire wakeword is detected, results from the on-device speech processing and/or the audio data is sent to a speech processing system to determine a responsive action to be performed by the device. When the entire wakeword is not detected, on-device processing is canceled and the device refrains from sending the audio data to the speech processing system.

7.

Granted invention
Dynamic wakeword detection (in force)

Publication (announcement) number: US10510340B1

Publication (announcement) date: 2019-12-17

Application number: US15832331

Application date: 2017-12-05

Applicant: Amazon Technologies, Inc.

Inventor: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu

IPC: G10L15/26, G10L15/18, G10L15/22, G10L15/16, G10L15/08

Abstract: Techniques for using a dynamic wakeword detection threshold are described. A server(s) may receive audio data corresponding to an utterance from a device in response to the device detecting a wakeword using a wakeword detection threshold. The server(s) may then determine the device should use a lower wakeword detection threshold for a duration of time. In addition to sending the device output data responsive to the utterance, the server(s) may send the device an instruction to use the lower wakeword detection threshold for the duration of time. Alternatively, the server(s) may train a machine learning model to determine when the device should use a lower wakeword detection threshold. The server(s) may send the trained machine learned model to the device for use at runtime.

8.

Granted invention
Dynamic wakeword detection (in force)

Publication (announcement) number: US11699433B2

Publication (announcement) date: 2023-07-11

Application number: US16936952

Application date: 2020-07-23

Applicant: Amazon Technologies, Inc.

Inventor: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu

IPC: G10L15/18, G10L15/30, G10L15/22, G10L15/08

CPC classification number: G10L15/18, G10L15/22, G10L15/30, G10L2015/088

Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.

9.

Granted invention
Speech processing using a recurrent neural network (in force)

Publication (announcement) number: US11205420B1

Publication (announcement) date: 2021-12-21

Application number: US16436562

Application date: 2019-06-10

Applicant: Amazon Technologies, Inc.

Inventor: Gengshen Fu, Thibaud Senechal, Shiv Naga Prasad Vitaladevuni, Michael J. Rodehorst, Varun K. Nagaraja

IPC: G10L15/16, G10L15/22, G10L15/06, G06N3/04, G06N3/02, G10L25/30, G10L15/08

Abstract: A system and method performs wakeword detection using a neural network model that includes a recurrent neural network (RNN) for processing variable-length wakewords. To prevent the model from being influenced by non-wakeword speech, multiple instances of the model are created to process audio data, and each instance is configured to use weights determined by training data. The model may instead or in addition be used to process the audio data only when a likelihood that the audio data corresponds to the wakeword is greater than a threshold. The model may process the audio data as represented by groups of acoustic feature vectors; computations for feature vectors common to different groups may be re-used.
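
The last point of this abstract, re-using computations for feature vectors shared by different groups, can be illustrated with a simple memoization sketch. It does not model the RNN itself: the function `score_windows` and the per-frame "energy" computation are hypothetical stand-ins for whatever per-vector work the real model does. Overlapping groups of frames share most of their members, so each frame's result is computed once and looked up afterwards.

```python
from typing import Dict, List, Sequence, Tuple


def score_windows(frames: Sequence[Sequence[float]],
                  window: int) -> Tuple[List[float], int]:
    """Score every sliding group of `window` frames, caching per-frame work.

    Returns the per-group scores and the number of per-frame computations
    actually performed.
    """
    cache: Dict[int, float] = {}

    def frame_energy(i: int) -> float:
        # Hypothetical per-frame computation; done once per frame index.
        if i not in cache:
            cache[i] = sum(x * x for x in frames[i])
        return cache[i]

    scores = [
        sum(frame_energy(i) for i in range(start, start + window))
        for start in range(len(frames) - window + 1)
    ]
    return scores, len(cache)


frames = [[1.0], [2.0], [3.0], [4.0]]
scores, computed = score_windows(frames, window=3)
print(scores, computed)
```

With four frames and groups of three, two groups cover six frame slots, but only four per-frame computations are performed.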
