-
Publication number: US11922949B1
Publication date: 2024-03-05
Application number: US16995005
Filing date: 2020-08-17
Applicant: Amazon Technologies, Inc.
Inventor: Aditya Sharadchandra Joshi , Dibyendu Nandy
CPC classification number: G10L15/34 , G10L2015/223 , G10L2015/225
Abstract: Techniques for improving the power consumption of a device without impacting or with minimal impact to operations of the device are described. In an example, the device includes a processor. The device receives, while the processor is operating in a first power mode, first input data corresponding to first audio detected by a microphone. Based at least in part on the first input data, the device detects a sound event or ambient noise. Based at least in part on a detection of the ambient noise only, the device causes the processor to operate in a second power mode in which the processor consumes less power than in the first power mode.
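Illustrative sketch (not taken from the patent): the mode decision described in this abstract could look roughly like the Python below. The RMS-energy detector, the `event_threshold` value, and all names (`PowerMode`, `classify_frame`, `next_power_mode`) are assumptions made for illustration; the abstract does not say how sound events are distinguished from ambient noise.

```python
from enum import Enum


class PowerMode(Enum):
    FIRST = "first"    # mode in which the processor consumes more power
    SECOND = "second"  # mode in which the processor consumes less power


def classify_frame(samples, event_threshold=0.1):
    """Hypothetical detector: RMS energy at or above the threshold is a sound event."""
    if not samples:
        raise ValueError("samples must not be empty")
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    return "sound_event" if rms >= event_threshold else "ambient_noise"


def next_power_mode(samples, event_threshold=0.1):
    """Pick the lower-power mode only when nothing but ambient noise is detected."""
    label = classify_frame(samples, event_threshold)
    return PowerMode.SECOND if label == "ambient_noise" else PowerMode.FIRST
```

A real device would presumably add hysteresis or a hold time before switching modes; the sketch omits that for brevity.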
-
Publication number: US12190902B1
Publication date: 2025-01-07
Application number: US17672298
Filing date: 2022-02-15
Applicant: Amazon Technologies, Inc.
Inventor: Aditya Sharadchandra Joshi , Zhouhui Miao
IPC: G10L21/0364 , G10L15/05 , G10L15/08 , G10L15/22 , G10L15/30 , G10L21/034 , G10L25/21
Abstract: A system configured to perform audio processing with adaptive multi-stage output gains. For example, an Audio Front End (AFE) component may generate a first output using a fixed gain value in order to improve device arbitration and a second output using an adaptive gain value in order to improve wakeword detection. A wakeword engine may process the second output to determine that a wakeword is present along with start/end times of the wakeword. The AFE component can use the start/end times to determine an amount of wakeword energy represented in the first output, which is sent to a remote device for device arbitration. The AFE component can also use the start/end times to determine an amount of wakeword energy represented in the second output, which can be used to determine the adaptive gain value that is unique to the device.
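Illustrative sketch (not taken from the patent): the two-output arrangement in this abstract can be pictured as below. The mean-square energy measure, the `target_energy` parameter, and the update rule in `update_adaptive_gain` are assumptions for illustration only; the abstract states that wakeword energy in the second output informs the adaptive gain, but not the formula used.

```python
def apply_gain(samples, gain):
    """Scale every sample by a linear gain."""
    return [s * gain for s in samples]


def segment_energy(samples, start, end):
    """Mean squared amplitude over samples[start:end] (the wakeword span)."""
    segment = samples[start:end]
    if not segment:
        raise ValueError("wakeword segment is empty")
    return sum(s * s for s in segment) / len(segment)


def update_adaptive_gain(current_gain, measured_energy, target_energy, step=0.5):
    """Hypothetical rule: move the gain part of the way toward the value
    that would have brought the measured wakeword energy to the target."""
    if measured_energy <= 0:
        return current_gain  # nothing measurable, leave the gain unchanged
    desired = current_gain * (target_energy / measured_energy) ** 0.5
    return current_gain + step * (desired - current_gain)


def process_wakeword(mic, start, end, fixed_gain, adaptive_gain, target_energy):
    """Return (energy for arbitration, updated adaptive gain)."""
    first_output = apply_gain(mic, fixed_gain)       # fixed gain: comparable across devices
    second_output = apply_gain(mic, adaptive_gain)   # adaptive gain: tuned for wakeword detection
    arbitration_energy = segment_energy(first_output, start, end)
    wakeword_energy = segment_energy(second_output, start, end)
    new_gain = update_adaptive_gain(adaptive_gain, wakeword_energy, target_energy)
    return arbitration_energy, new_gain
```

In this sketch the `start` and `end` indices stand in for the start/end times reported by the wakeword engine.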
-
Publication number: US12033631B1
Publication date: 2024-07-09
Application number: US17671724
Filing date: 2022-02-15
Applicant: Amazon Technologies, Inc.
Inventor: Aditya Sharadchandra Joshi
IPC: G10L15/22 , G10L15/08 , G10L21/0208 , G10L25/06
CPC classification number: G10L15/22 , G10L15/08 , G10L25/06 , G10L2015/088 , G10L2015/223 , G10L2021/02082
Abstract: A system configured to perform self-trigger prevention to avoid a device waking itself up when a wakeword is output by the device's own output audio. For example, during active playback the device may perform double-talk detection and suppress wakewords or other device-directed utterances when near-end speech is not present. To detect whether near-end speech is present, an Audio Front End (AFE) of the device may perform echo cancellation and generate correlation data indicating an amount of correlation between an output of the echo canceller and an estimated reference signal. When the correlation is high in certain frequency ranges, near-end speech is not present and the device may suppress the utterance. When the correlation is low, indicating that near-end speech could be present, the device does not suppress the utterance and sends the utterance to a remote system for speech processing.
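Illustrative sketch (not taken from the patent): the suppression decision in this abstract reduces to a correlation test. The version below is a simplification that computes a full-band, time-domain normalized correlation, whereas the abstract refers to correlation in certain frequency ranges. The `threshold` value and the function names are assumptions.

```python
import math


def normalized_correlation(a, b):
    """Absolute normalized correlation in [0, 1] between two equal-length frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a silent frame carries no evidence of residual echo
    return abs(dot) / (norm_a * norm_b)


def should_suppress(aec_output, reference_estimate, threshold=0.8):
    """High correlation: the echo canceller output is mostly the device's own
    playback, so near-end speech is treated as absent and the utterance is suppressed."""
    return normalized_correlation(aec_output, reference_estimate) >= threshold
```

When `should_suppress` returns `False`, near-end speech could be present, and the utterance would be forwarded for speech processing as the abstract describes.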
-
Publication number: US20240363112A1
Publication date: 2024-10-31
Application number: US18765717
Filing date: 2024-07-08
Applicant: Amazon Technologies, Inc.
Inventor: Aditya Sharadchandra Joshi
IPC: G10L15/22 , G10L15/08 , G10L21/0208 , G10L25/06
CPC classification number: G10L15/22 , G10L15/08 , G10L25/06 , G10L2015/088 , G10L2015/223 , G10L2021/02082
Abstract: A system configured to perform self-trigger prevention to avoid a device waking itself up when a wakeword is output by the device's own output audio. For example, during active playback the device may perform double-talk detection and suppress wakewords or other device-directed utterances when near-end speech is not present. To detect whether near-end speech is present, an Audio Front End (AFE) of the device may perform echo cancellation and generate correlation data indicating an amount of correlation between an output of the echo canceller and an estimated reference signal. When the correlation is high in certain frequency ranges, near-end speech is not present and the device may suppress the utterance. When the correlation is low, indicating that near-end speech could be present, the device does not suppress the utterance and sends the utterance to a remote system for speech processing.
-
Publication number: US11942100B1
Publication date: 2024-03-26
Application number: US17713084
Filing date: 2022-04-04
Applicant: Amazon Technologies, Inc.
Inventor: Aditya Sharadchandra Joshi , Carlo Murgia , Michael Thomas Peterson
CPC classification number: G10L19/02 , G06F3/16 , G06F3/165 , G10L19/002 , G10L21/02 , H04L65/70 , H04L65/75
Abstract: Techniques for encoding audio data with metadata are described. In an example, a device receives audio data corresponding to audio detected by a microphone and receives metadata associated with the audio. The device generates encoded data based at least in part on encoding the audio data with the metadata. The encoding involves replacing a portion of the audio data with the metadata, such that the encoded data includes the metadata and a remaining portion of the audio data. The device sends the encoded data to an audio processing application.
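Illustrative sketch (not taken from the patent): one way to replace a portion of the audio data with metadata is to overwrite the least significant bit of some PCM samples. The abstract does not say which portion is replaced, so the LSB scheme and the names `embed_metadata` and `extract_metadata` are assumptions chosen only to make the idea concrete.

```python
def embed_metadata(samples, metadata: bytes):
    """Overwrite the least significant bit of the first 8 * len(metadata)
    integer PCM samples with the metadata bits (most significant bit first)."""
    bits = [(byte >> (7 - i)) & 1 for byte in metadata for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the metadata")
    encoded = list(samples)
    for index, bit in enumerate(bits):
        encoded[index] = (encoded[index] & ~1) | bit
    return encoded


def extract_metadata(encoded, num_bytes):
    """Read num_bytes of metadata back out of the sample LSBs."""
    out = bytearray()
    for b in range(num_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (encoded[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)
```

The encoded list then holds the metadata plus the remaining (untouched) bits of the audio, and each affected sample changes by at most one quantization step.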
-
Publication number: US11315581B1
Publication date: 2022-04-26
Application number: US16995220
Filing date: 2020-08-17
Applicant: Amazon Technologies, Inc.
Inventor: Aditya Sharadchandra Joshi , Carlo Murgia , Michael Thomas Peterson
IPC: G10L19/002 , G10L19/02 , H04L65/60 , G10L21/02 , G06F3/16
Abstract: Techniques for encoding audio data with metadata are described. In an example, a device receives audio data corresponding to audio detected by a microphone and receives metadata associated with the audio. The device generates encoded data based at least in part on encoding the audio data with the metadata. The encoding involves replacing a portion of the audio data with the metadata, such that the encoded data includes the metadata and a remaining portion of the audio data. The device sends the encoded data to an audio processing application.