-
Publication number: US09967661B1
Publication date: 2018-05-08
Application number: US15490045
Filing date: 2017-04-18
Applicant: Amazon Technologies, Inc.
Inventor: Philip Ryan Hilmes , Robert Ayrapetian
IPC: H04R3/00 , H04R3/02 , G10L21/0216 , G10L21/0208
CPC classification number: H04R3/005 , G10K11/002 , G10K2210/3028 , G10K2210/3044 , G10K2210/3046 , G10K2210/505 , G10K2210/509 , G10L21/0272 , G10L2015/223 , G10L2021/02082 , G10L2021/02166 , H04M9/082 , H04R3/02 , H04R2203/12 , H04R2430/25
Abstract: An echo cancellation system performs audio beamforming to separate audio input into multiple directions (e.g., target signals) and generates multiple audio outputs using two acoustic echo cancellation (AEC) circuits. A first AEC removes a playback reference signal (generated from a signal sent to a loudspeaker) to isolate speech included in the target signals. A second AEC removes an adaptive reference signal (generated from microphone inputs corresponding to audio received from the loudspeaker) to isolate speech included in the target signals. A beam selector receives the multiple audio outputs and selects the first AEC or the second AEC based on a linearity of the system. When linear (e.g., no distortion or variable delay between microphone input and playback signal), the beam selector selects an output from the first AEC based on signal-to-noise ratios (SNRs). When nonlinear, the beam selector selects an output from the second AEC.
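The beam-selection logic described in this abstract could be sketched as follows. This is a minimal illustration, not the patented implementation: the `BeamOutput` type, the `select_beam` function, and the boolean `system_is_linear` flag are hypothetical names, and picking the highest-SNR beam on the nonlinear path is an assumption (the abstract only states the SNR criterion for the linear case).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BeamOutput:
    """One directional output of an AEC path (hypothetical structure)."""
    direction: int   # beam index
    snr_db: float    # estimated signal-to-noise ratio for this beam


def select_beam(first_aec: Sequence[BeamOutput],
                second_aec: Sequence[BeamOutput],
                system_is_linear: bool) -> BeamOutput:
    """Pick an output from the playback-reference AEC when the system is
    linear, otherwise from the adaptive-reference AEC."""
    candidates = first_aec if system_is_linear else second_aec
    if not candidates:
        raise ValueError("selected AEC path produced no outputs")
    # Assumption: the highest SNR wins on both paths.
    return max(candidates, key=lambda beam: beam.snr_db)
```

How linearity is detected (for example, by checking for distortion or variable delay) is outside this sketch and is simply passed in as a flag.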
-
Publication number: US09918163B1
Publication date: 2018-03-13
Application number: US15341520
Filing date: 2016-11-02
Applicant: Amazon Technologies, Inc.
Inventor: Robert Ayrapetian , Philip Ryan Hilmes
CPC classification number: H04R3/02 , H04R1/1008 , H04R1/1083 , H04R2420/07 , H04S3/00
Abstract: An echo cancellation system that detects and compensates for differences in sample rates between the echo cancellation system and a set of wireless speakers based on a frequency-domain analysis of estimated impulse response coefficients. The system tracks the real and imaginary number components of the coefficients, and determines a “rotation” of the coefficients over time caused by a frequency offset between the audio sent to the speakers and the audio received from a microphone. Based on the rotation, samples of the audio are added or dropped when echo cancellation is performed, compensating for the frequency offset.
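The "rotation" idea can be illustrated with a short sketch. It assumes one complex frequency-domain coefficient is tracked per frame for a single DFT bin, and it uses the DFT shift theorem (a delay of d samples multiplies bin k by exp(-j2πkd/N)) to convert the average phase rotation into a drift in samples per frame. The function names and this conversion are illustrative assumptions, not details taken from the patent.

```python
import cmath
import math
from typing import Sequence


def mean_rotation(coeffs: Sequence[complex]) -> float:
    """Average phase rotation (radians per frame) of one tracked coefficient."""
    # Summing c[n] * conj(c[n-1]) averages the frame-to-frame phase step.
    acc = sum(cur * prev.conjugate() for prev, cur in zip(coeffs, coeffs[1:]))
    return cmath.phase(acc)


def drift_samples_per_frame(coeffs: Sequence[complex],
                            bin_index: int,
                            fft_size: int) -> float:
    """Convert the rotation of bin `bin_index` into delay drift in samples."""
    if bin_index == 0:
        raise ValueError("bin 0 carries no phase information")
    return -mean_rotation(coeffs) * fft_size / (2 * math.pi * bin_index)
```

A caller could accumulate the per-frame drift and add or drop one sample whenever the running total passes a whole sample, which is one plausible way to realize the compensation the abstract describes.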
-
Publication number: US09818425B1
Publication date: 2017-11-14
Application number: US15185799
Filing date: 2016-06-17
Applicant: Amazon Technologies, Inc.
Inventor: Robert Ayrapetian , Philip Ryan Hilmes , Wai Chung Chu , Hyeong Cheol Kim , Yuwen Su
IPC: G10L21/0224 , G10L15/30 , G10L25/84 , G10L21/0208 , G10L21/0216 , G10L15/22
CPC classification number: G10L21/0224 , G10L15/30 , G10L2015/223 , G10L2021/02082 , G10L2021/02166
Abstract: An echo cancellation system that generates multiple output paths, enabling Automatic Speech Recognition (ASR) processing in parallel with voice communication. For single direction AEC (e.g., ASR processing), the system prioritizes speech from a single user and ignores other speech by selecting a single directional output from a plurality of directional outputs as a first output path. For multi-directional AEC (e.g., voice communication), the system includes all speech by combining the plurality of directional outputs as a second output path. The system may use a weighted sum technique, such that each directional output is represented in the combined output based on a corresponding signal metric, or an equal weighting technique, such that a first group of directional outputs having a higher signal metric may be equally weighted using a first weight while a second group of directional outputs having a lower signal metric may be equally weighted using a second weight.
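The two combining techniques named in this abstract can be sketched in a few lines. The helper names (`metric_weights`, `grouped_equal_weights`, `combine`) and the use of a fixed threshold to split the two groups are assumptions made for illustration; the abstract does not specify how the groups are formed.

```python
from typing import List, Sequence


def metric_weights(metrics: Sequence[float]) -> List[float]:
    """Weighted-sum technique: weight each beam in proportion to its metric."""
    total = sum(metrics)
    if total <= 0:
        raise ValueError("signal metrics must sum to a positive value")
    return [m / total for m in metrics]


def grouped_equal_weights(metrics: Sequence[float], threshold: float,
                          high_weight: float, low_weight: float) -> List[float]:
    """Equal-weighting technique: one weight for the higher-metric group,
    another for the lower-metric group (threshold split is an assumption)."""
    return [high_weight if m >= threshold else low_weight for m in metrics]


def combine(beams: Sequence[Sequence[float]],
            weights: Sequence[float]) -> List[float]:
    """Mix equally long directional outputs sample by sample."""
    return [sum(w * beam[i] for w, beam in zip(weights, beams))
            for i in range(len(beams[0]))]
```

For the single-direction path used for ASR, no mixing is needed: the system would pass through the one selected directional output instead.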
-
Publication number: US09659555B1
Publication date: 2017-05-23
Application number: US15019129
Filing date: 2016-02-09
Applicant: Amazon Technologies, Inc.
Inventor: Philip Ryan Hilmes , Robert Ayrapetian
IPC: G10K11/00 , G10L21/0272 , G10L21/0208 , G10L15/22
CPC classification number: H04R3/005 , G10K11/002 , G10K2210/3028 , G10K2210/3044 , G10K2210/3046 , G10K2210/505 , G10K2210/509 , G10L21/0272 , G10L2015/223 , G10L2021/02082 , G10L2021/02166 , H04M9/082 , H04R3/02 , H04R2203/12 , H04R2430/25
Abstract: An echo cancellation system performs audio beamforming to separate audio input into multiple directions (e.g., target signals) and generates multiple audio outputs using two acoustic echo cancellation (AEC) circuits. A first AEC removes a playback reference signal (generated from a signal sent to a loudspeaker) to isolate speech included in the target signals. A second AEC removes an adaptive reference signal (generated from microphone inputs corresponding to audio received from the loudspeaker) to isolate speech included in the target signals. A beam selector receives the multiple audio outputs and selects the first AEC or the second AEC based on a linearity of the system. When linear (e.g., no distortion or variable delay between microphone input and playback signal), the beam selector selects an output from the first AEC based on signal-to-noise ratios (SNRs). When nonlinear, the beam selector selects an output from the second AEC.
-
Publication number: US09390723B1
Publication date: 2016-07-12
Application number: US14568033
Filing date: 2014-12-11
Applicant: Amazon Technologies, Inc.
Inventor: John Walter McDonough, Jr. , Wai Chung Chu , Amit Singh Chhetri , Robert Ayrapetian
IPC: H04R3/00 , G10L21/02 , G10K11/175
CPC classification number: G10K11/175 , G10L21/0208 , G10L21/0232 , G10L2021/02082
Abstract: Features are disclosed for performing efficient dereverberation of speech signals captured with single- and multi-channel sensors in networked audio systems. Such features could be used in applications requiring automatic recognition of speech captured with sensors. Dereverberation is performed in the sub-band domain, and hence provides improved dereverberation performance in terms of signal quality, algorithmic delay, computational efficiency, and speed of convergence.
-
Publication number: US11792570B1
Publication date: 2023-10-17
Application number: US17470035
Filing date: 2021-09-09
Applicant: Amazon Technologies, Inc.
Inventor: Pradeep Kumar Govindaraju , Robert Ayrapetian
IPC: H04R3/00 , H04R3/04 , G10L21/0216 , H04R5/04 , G10L21/0208
CPC classification number: H04R3/005 , G10L21/0216 , H04R3/04 , H04R5/04 , G10L2021/02082 , G10L2021/02166
Abstract: Techniques for improving microphone noise suppression are provided. As wind noise may disproportionately impact a subset of microphones, a method for processing audio data using two adaptive reference algorithm (ARA) paths in parallel is provided. For example, first ARA processing performs noise cancellation using all microphones, while second ARA processing performs noise cancellation using only a portion of the microphones. As the first ARA processing and the second ARA processing are performed in parallel, beam merging can be performed using beams from the first ARA, the second ARA, and/or a combination of each. In addition, beam merging can be performed using beam sections instead of individual beams to further improve performance and reduce attenuation to speech.
-
Publication number: US11277685B1
Publication date: 2022-03-15
Application number: US16180890
Filing date: 2018-11-05
Applicant: Amazon Technologies, Inc.
Inventor: Robert Ayrapetian , Philip Ryan Hilmes , Mohamed Mansour , Carlo Murgia
IPC: G10L21/02 , H04R3/00 , H04R5/04 , H04R5/027 , G10L21/0224 , G06F3/16 , G10L21/0272 , G10L21/0208 , G10L21/0216 , G10L25/93 , G10L25/51 , H03H21/00 , G10L25/78
Abstract: Techniques for improving adaptive interference cancellation (AIC) using cascaded AIC algorithms are described. To improve an accuracy of detecting speech, a device may perform a first stage of AIC to generate isolated audio data and may generate speech mask data indicating time windows when speech is detected in the isolated audio data. Based on the speech mask data, the device may perform second AIC to generate output audio data, with adaptation of the adaptive filter enabled when the speech is not detected and disabled when the speech is detected. Thus, the first AIC improves the accuracy with which the device detects that speech is present and the second AIC reduces distortion in the output audio data by not updating filter coefficient values when the speech is present. The first AIC may use playback audio data, microphone audio data or beamformed audio data as reference signals.
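The second-stage behavior (adapt only while no speech is detected) can be sketched with a normalized LMS filter. NLMS is swapped in here purely for illustration; the abstract does not say which adaptive algorithm is used, and `gated_nlms` and its parameters are hypothetical names. The speech mask is assumed to come from the first AIC stage.

```python
from typing import List, Sequence, Tuple


def gated_nlms(target: Sequence[float],
               reference: Sequence[float],
               speech_mask: Sequence[bool],
               taps: int = 4,
               step: float = 0.5,
               eps: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Cancel `reference` from `target`, freezing the filter when speech is present."""
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent reference samples, newest first
    output = []
    for d, x, speech in zip(target, reference, speech_mask):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate        # interference-cancelled output sample
        output.append(error)
        if not speech:              # adaptation disabled while speech is detected
            norm = sum(h * h for h in history) + eps
            weights = [w + step * error * h / norm
                       for w, h in zip(weights, history)]
    return output, weights
```

Freezing the coefficients during speech keeps the filter from adapting toward the talker, which is the distortion the abstract says the second stage avoids.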
-
Publication number: US10937418B1
Publication date: 2021-03-02
Application number: US16240294
Filing date: 2019-01-04
Applicant: Amazon Technologies, Inc.
Inventor: Navin Chatlani , Krishna Kamath Koteshwara , Trausti Thor Kristjansson , Inseok Heo , Robert Ayrapetian
IPC: G10L15/20 , G10L21/0232 , G10L15/22 , G10L21/0208
Abstract: A system configured to improve echo cancellation for nonlinear systems. The system generates reference audio data by isolating portions of microphone audio data that correspond to playback audio data. For example, the system may determine a correlation between the playback audio data and the microphone audio data in individual time-frequency bands in a frequency domain. In some examples, the system may substitute microphone audio data associated with output audio for the playback audio data. The system may generate the reference audio data based on portions of the microphone audio data that have a strong correlation with the playback audio data. The system may generate the reference audio data by selecting these portions of the microphone audio data or by performing beamforming. This results in precise time alignment between the reference audio data and the microphone audio data, improving performance of the echo cancellation.
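The selection variant (keeping microphone bands that correlate strongly with playback) might look like the sketch below. It assumes each band is represented by a short time series of magnitudes, uses a Pearson correlation, and zeroes out bands below a fixed threshold; the names `correlation` and `build_reference`, the 0.8 threshold, and the choice of Pearson correlation are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equally long series (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def build_reference(mic_bands: Sequence[Sequence[float]],
                    playback_bands: Sequence[Sequence[float]],
                    threshold: float = 0.8) -> List[List[float]]:
    """Keep microphone bands that track the playback; silence the rest."""
    return [list(mic) if correlation(mic, play) >= threshold
            else [0.0] * len(mic)
            for mic, play in zip(mic_bands, playback_bands)]
```

Because the reference is built from the microphone signal itself, it is already time-aligned with the microphone audio, which is the benefit the abstract highlights.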
-
Publication number: US10553236B1
Publication date: 2020-02-04
Application number: US15906967
Filing date: 2018-02-27
Applicant: Amazon Technologies, Inc.
Inventor: Robert Ayrapetian , Trausti Thor Kristjansson , Philip Ryan Hilmes , Carlo Murgia
IPC: G10L21/0232 , H04B17/336 , H04B7/015 , G10L25/21 , G10L21/0364 , G10L25/84 , G10L21/0208
Abstract: A system configured to improve noise cancellation by reducing attenuation of local speech in proximity to a device. When the local speech is present in both a target signal and a reference signal, performing noise cancellation to remove the reference signal inadvertently attenuates the local speech. To prevent this, the system may perform first noise cancellation to identify frequency bands associated with the local speech and may generate a modified reference signal based on the frequency bands. For example, the system may generate the modified reference signal by applying attenuation to first frequencies associated with the local speech and/or gain to second frequencies that are not associated with the local speech. The system may generate final output audio data by performing noise cancellation using the modified reference signal.
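The reference-modification step can be illustrated with a per-band scaling. The sketch assumes the first noise-cancellation pass has already produced the set of band indices associated with local speech; `modify_reference` and its default attenuation and gain values are hypothetical.

```python
from typing import List, Sequence, Set


def modify_reference(reference_bands: Sequence[float],
                     speech_bands: Set[int],
                     attenuation: float = 0.1,
                     gain: float = 1.0) -> List[float]:
    """Scale each frequency band of the reference: attenuate bands tied to
    local speech, apply gain to the remaining bands."""
    return [value * (attenuation if index in speech_bands else gain)
            for index, value in enumerate(reference_bands)]
```

Running the final noise cancellation against this modified reference removes less energy in the speech bands, so nearby speech is attenuated less.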
-
Publication number: US10522167B1
Publication date: 2019-12-31
Application number: US15895313
Filing date: 2018-02-13
Applicant: Amazon Technologies, Inc.
Inventor: Robert Ayrapetian , Philip Ryan Hilmes , Trausti Thor Kristjansson
IPC: G10L21/0216 , G10L21/0264 , G10L25/30 , G10L17/18 , G10L21/0208
Abstract: A system configured to improve beamforming by using deep neural networks (DNNs). The system can use one trained DNN to focus on a first person speaking an utterance (e.g., target user) and one or more trained DNNs to focus on noise source(s) (e.g., wireless loudspeaker(s), a second person speaking, other localized sources of noise, or the like). The DNNs may generate time-frequency mask data that indicates individual frequency bands that correspond to the particular source detected by the DNN. Using this mask data, a beamformer can generate beamformed audio data that is specific to a source of noise. The system may perform noise cancellation to isolate first beamformed audio data associated with the target user by removing second beamformed audio data associated with noise source(s).