-
Publication number: US11762052B1
Publication date: 2023-09-19
Application number: US17475888
Application date: 2021-09-15
Applicant: Amazon Technologies, Inc.
Inventor: Anshuman Ganguly, Mrudula V. Athi, Spencer Russell, Alexander M. Epstein, Wontak Kim
CPC classification number: G01S3/8083, G01S5/20, G06T7/70, G10L15/22, G10L2015/223
Abstract: Techniques for improving sound source localization (SSL) are provided. A method for probabilistic SSL using a deep neural network (DNN) may include receiving audio data including a representation of audio such as a wakeword from a microphone array. The audio data may be processed by a DNN to output a plurality of values where each value indicates a probability that the audio originated from a direction corresponding to that value. A sensor may provide computer vision or other data which may be used to inform the plurality of values based on detecting presence of a human or obstacle. A probability that the audio originated from one of the directions of the plurality of directions may be determined based at least in part on the DNN output and the computer vision or other data.
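Illustrative sketch (not part of the patent record): the abstract says the final direction probability is based on both the DNN output and the sensor data, but it does not say how the two are combined. The snippet below assumes a simple multiplicative weighting followed by renormalization. The function names, the weight values, and the fusion rule itself are hypothetical and chosen only to make the idea concrete.

```python
def fuse_direction_probabilities(dnn_probs, sensor_weights):
    """Reweight per-direction DNN probabilities by sensor-derived weights.

    Both arguments hold one entry per candidate direction. The multiplicative
    rule is an assumption for illustration, not the patented method.
    """
    if len(dnn_probs) != len(sensor_weights):
        raise ValueError("one sensor weight is required per direction")
    unnormalized = [p * w for p, w in zip(dnn_probs, sensor_weights)]
    total = sum(unnormalized)
    if total == 0:
        # The sensor ruled out every direction; fall back to the DNN output.
        return list(dnn_probs)
    return [u / total for u in unnormalized]


def most_likely_direction(probs):
    """Return the index of the direction with the highest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Hypothetical values: four candidate directions. The sensor down-weights
# directions 0 and 3 (e.g. an obstacle was detected there) and leaves
# directions 1 and 2 unchanged.
dnn_probs = [0.40, 0.35, 0.15, 0.10]
sensor_weights = [0.2, 1.0, 1.0, 0.2]

fused = fuse_direction_probabilities(dnn_probs, sensor_weights)
best = most_likely_direction(fused)
```

With these made-up numbers the DNN alone favors direction 0, while the fused estimate favors direction 1, which is the kind of correction the abstract attributes to the sensor data.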
-
Publication number: US12272369B1
Publication date: 2025-04-08
Application number: US17578737
Application date: 2022-01-19
Applicant: Amazon Technologies, Inc.
Inventor: Amit Singh Chhetri, Mrudula V. Athi, Pradeep Kumar Govindaraju, Rong Hu
IPC: G10L21/0216, G10L21/0208, G10L21/10, G10L25/30
Abstract: A system configured to improve audio processing by performing dereverberation and noise reduction during a communication session. In some examples, the system may include a deep neural network (DNN) configured to perform speech enhancement, which is located after an Acoustic Echo Cancellation (AEC) component. For example, the DNN may process isolated audio data output by the AEC component to jointly mitigate additive noise and reverberation. In other examples, the system may include a DNN configured to perform acoustic interference cancellation, which may jointly mitigate additive noise, reverberation, and residual echo, removing the need to perform residual echo suppression processing. The DNN is configured to process complex-valued spectrograms corresponding to the isolated audio data and/or estimated echo data generated by the AEC component.
-