-
Publication number: US11711647B1
Publication date: 2023-07-25
Application number: US17197929
Filing date: 2021-03-10
Applicant: Amazon Technologies, Inc.
Inventor: Kuan-Chieh Yen , Daniel Wayne Harris , Carlo Murgia , Taro Kimura
CPC classification number: H04R3/00 , H04R1/1016 , H04R2201/107
Abstract: This disclosure describes techniques for detecting voice commands from a user of an ear-based device. The ear-based device may include an in-ear facing microphone to capture sound emitted in an ear of the user, and an exterior facing microphone to capture sound emitted in an exterior environment of the user. The in-ear microphone may generate an inner audio signal representing the sound emitted in the ear, and the exterior microphone may generate an outer audio signal representing sound from the exterior environment. The ear-based device may compute a ratio of a power of the inner audio signal to the outer audio signal and may compare this ratio to a threshold. If the ratio is larger than the threshold, the ear-based device may detect the voice of the user. Further, the ear-based device may set a value of the threshold based on a level of acoustic seal of the ear-based device.
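The power-ratio test in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the numeric thresholds, and the linear mapping from seal level to threshold (including the assumption that a tighter seal calls for a higher threshold) are all hypothetical.

```python
def mean_power(samples):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def threshold_for_seal(seal_level, loose_threshold=2.0, tight_threshold=4.0):
    """Hypothetical mapping from acoustic seal level (0 = loose, 1 = tight)
    to a power-ratio threshold, by linear interpolation. The direction and
    the numbers are assumptions, not taken from the abstract."""
    seal_level = min(1.0, max(0.0, seal_level))
    return loose_threshold + seal_level * (tight_threshold - loose_threshold)


def detect_user_voice(inner_frame, outer_frame, seal_level, eps=1e-12):
    """Return True when the inner/outer power ratio exceeds the threshold."""
    # eps guards against division by zero on a silent outer frame.
    ratio = mean_power(inner_frame) / (mean_power(outer_frame) + eps)
    return ratio > threshold_for_seal(seal_level)


# The wearer speaking: the in-ear signal is much stronger than the outer one.
user_speaking = detect_user_voice([0.5] * 4, [0.1] * 4, seal_level=1.0)
# An external talker: the outer signal dominates.
other_speaking = detect_user_voice([0.1] * 4, [0.5] * 4, seal_level=1.0)
```

In this sketch the threshold is recomputed per call, so a change in the estimated seal level takes effect on the next frame.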
-
Publication number: US11315581B1
Publication date: 2022-04-26
Application number: US16995220
Filing date: 2020-08-17
Applicant: Amazon Technologies, Inc.
Inventor: Aditya Sharadchandra Joshi , Carlo Murgia , Michael Thomas Peterson
IPC: G10L19/002 , G10L19/02 , H04L65/60 , G10L21/02 , G06F3/16
Abstract: Techniques for encoding audio data with metadata are described. In an example, a device receives audio data corresponding to audio detected by a microphone and receives metadata associated with the audio. The device generates encoded data based at least in part on encoding the audio data with the metadata. The encoding involves replacing a portion of the audio data with the metadata, such that the encoded data includes the metadata and a remaining portion of the audio data. The device sends the encoded data to an audio processing application.
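As an illustration of replacing a portion of the audio data with metadata, the sketch below overwrites the start of an audio buffer with a one-byte length field followed by the metadata bytes. The layout, the function names, and the choice of which portion to replace are assumptions made for the example; the abstract does not specify them.

```python
def encode_with_metadata(audio: bytes, metadata: bytes) -> bytes:
    """Replace the leading bytes of `audio` with a length-prefixed metadata
    header (hypothetical layout). The output has the same length as the
    input, so it holds the metadata plus the remaining audio."""
    header = bytes([len(metadata)]) + metadata if len(metadata) <= 255 else None
    if header is None or len(header) > len(audio):
        raise ValueError("metadata does not fit in the audio buffer")
    return header + audio[len(header):]


def decode_with_metadata(encoded: bytes):
    """Split an encoded buffer into (metadata, remaining audio)."""
    size = encoded[0]
    metadata = encoded[1:1 + size]
    remaining_audio = encoded[1 + size:]
    return metadata, remaining_audio


audio = bytes(range(16))
encoded = encode_with_metadata(audio, b"vad=1")
metadata, remaining_audio = decode_with_metadata(encoded)
```

Keeping the encoded buffer the same size as the original means a downstream consumer that expects fixed-size frames would not need to change its buffer handling, which is one plausible reason to replace rather than append.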
-
Publication number: US11259117B1
Publication date: 2022-02-22
Application number: US17036807
Filing date: 2020-09-29
Applicant: Amazon Technologies, Inc.
Inventor: Kanthasamy Chelliah , Wai Chung Chu , Andreas Schwarz , Carlos Renato Nakagawa , Berkant Tacer , Carlo Murgia
Abstract: A system configured to improve audio processing by performing dereverberation and noise reduction during a communication session. The system may apply a two-channel dereverberation algorithm by calculating coherence-to-diffuse ratio (CDR) values and calculating dereverberation (DER) gain values based on the CDR values. While the device calculates the DER gain values prior to performing acoustic echo cancellation (AEC) processing, the device applies the DER gain values after performing residual echo suppression (RES) processing in order to avoid excessive attenuation of the local speech. To improve output speech quality, the device does not apply the DER gain values for nonreverberant signals, when a signal-to-noise ratio (SNR) value is too low, and/or when far-end talk (e.g., remote speech) is present. Dereverberation processing is further improved by using frequency dependent parameters to calculate the DER gain values and by adjusting other gain values when the DER gain values are applied.
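The gating conditions in this abstract can be sketched as follows. The abstract gives no formula for the DER gain, so the Wiener-style mapping `cdr / (cdr + 1)`, the gain floor, and the SNR and CDR cutoffs below are all assumptions chosen only to make the control flow concrete; a real system would compute these per frequency band.

```python
def der_gain(cdr, min_gain=0.1):
    """Hypothetical dereverberation gain from a linear-scale CDR value,
    using a Wiener-style mapping with a floor to limit attenuation."""
    cdr = max(0.0, cdr)
    return max(min_gain, cdr / (cdr + 1.0))


def gain_to_apply(cdr, snr_db, far_end_active,
                  snr_floor_db=5.0, nonreverberant_cdr=10.0):
    """Return the gain for one band, bypassing dereverberation (gain 1.0)
    in the three cases the abstract lists."""
    if far_end_active:              # far-end talk is present
        return 1.0
    if snr_db < snr_floor_db:       # SNR is too low
        return 1.0
    if cdr >= nonreverberant_cdr:   # signal is treated as nonreverberant
        return 1.0
    return der_gain(cdr)


reverberant_gain = gain_to_apply(cdr=1.0, snr_db=20.0, far_end_active=False)
```

Consistent with the abstract, such a gain would be computed from the microphone signals before AEC but multiplied into the signal only after RES.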
-
Publication number: US10600432B1
Publication date: 2020-03-24
Application number: US15471629
Filing date: 2017-03-28
Applicant: Amazon Technologies, Inc.
Inventor: Wai Chung Chu , Carlo Murgia , Hyeong Cheol Kim
IPC: G10L21/034 , G10L25/84 , G10L21/02 , G10L25/21
Abstract: A system configured to perform power normalization for voice enhancement. The system may identify active intervals corresponding to voice activity and may selectively amplify the active intervals in order to generate output audio data at a near uniform loudness. The system may determine a variable gain for each of the active intervals based on a desired output loudness and a flatness value, which indicates how much a signal envelope is to be modified. For example, a low flatness value corresponds to no modification, with peak active interval values corresponding to the desired output loudness and lower active intervals being lower than the desired output loudness. In contrast, a high flatness value corresponds to extensive modification, with peak active interval values and lower active interval values both corresponding to the desired output loudness. Thus, individual words may share the same peak power level.
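One way to read the flatness parameter is as an interpolation weight between a single shared gain and a per-interval gain. The sketch below works in decibels and assumes flatness lies in [0, 1]; the formula and the names are illustrative assumptions, not the claimed method.

```python
def interval_gains_db(levels_db, target_db, flatness):
    """Gain in dB for each active interval.

    levels_db: measured loudness of each active interval, in dB.
    target_db: desired output loudness, in dB.
    flatness:  0.0 keeps the envelope (only the peak interval reaches the
               target); 1.0 brings every interval to the target.
    """
    peak = max(levels_db)
    gains = []
    for level in levels_db:
        # Output level sits below the target by a fraction of the
        # interval's original distance from the peak.
        output_level = target_db - (1.0 - flatness) * (peak - level)
        gains.append(output_level - level)
    return gains


levels = [-20.0, -30.0]
keep_envelope = interval_gains_db(levels, target_db=-10.0, flatness=0.0)
flatten_fully = interval_gains_db(levels, target_db=-10.0, flatness=1.0)
```

With flatness 0.0 both intervals receive the same 10 dB gain, so the quieter interval stays 10 dB below the target; with flatness 1.0 the quieter interval receives 20 dB and both land on the target.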
-