-
Publication No.: US11812237B2
Publication date: 2023-11-07
Application No.: US17553976
Filing date: 2021-12-17
IPC classification: G10L21/02, H04R3/00, H04R5/04, H04R5/027, G10L21/0224, G06F3/16, G10L21/0272, G10L21/0208, G10L21/0216, G10L25/93, G10L25/51, H03H21/00, G10L25/78
CPC classification: H04R3/005, G06F3/167, G10L21/02, G10L21/0208, G10L21/0224, G10L21/0272, H04R5/027, H04R5/04, G10L25/51, G10L25/78, G10L25/93, G10L2021/02082, G10L2021/02166, H03H21/0012
Abstract: Techniques for improving adaptive interference cancellation (AIC) using cascaded AIC algorithms are described. To improve the accuracy of speech detection, a device may perform a first stage of AIC to generate isolated audio data and may generate speech mask data indicating time windows when speech is detected in the isolated audio data. Based on the speech mask data, the device may perform a second stage of AIC to generate output audio data, with adaptation of the adaptive filter enabled when speech is not detected and disabled when speech is detected. Thus, the first AIC stage improves the accuracy with which the device detects that speech is present, and the second AIC stage reduces distortion in the output audio data by not updating filter coefficient values when speech is present. The first AIC stage may use playback audio data, microphone audio data, or beamformed audio data as reference signals.
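As a rough illustration of the cascade described in this abstract, the Python sketch below chains two normalized-LMS (NLMS) cancellers. The choice of NLMS, the frame-energy detector standing in for speech detection, and every name and parameter value (`nlms_step`, `run_aic`, `cascaded_aic`, `taps`, `frame`, `threshold`) are assumptions made for illustration; the abstract does not specify the filter type or the detector.

```python
import math
import random


def nlms_step(weights, window, target, adapt=True, mu=0.5, eps=1e-8):
    """Cancel the reference from one target sample; optionally update weights."""
    error = target - sum(w * r for w, r in zip(weights, window))
    if adapt:
        norm = sum(r * r for r in window) + eps
        weights = [w + mu * error * r / norm for w, r in zip(weights, window)]
    return error, weights


def run_aic(target, reference, taps, adapt_flags):
    """Run one AIC stage; adapt_flags[n] says whether to adapt at sample n."""
    weights, out = [0.0] * taps, []
    for n, sample in enumerate(target):
        window = [reference[n - k] if n >= k else 0.0 for k in range(taps)]
        error, weights = nlms_step(weights, window, sample, adapt_flags[n])
        out.append(error)
    return out


def cascaded_aic(target, reference, taps=4, frame=32, threshold=0.01):
    # Stage 1: always adapting; its output feeds the (assumed) speech detector.
    isolated = run_aic(target, reference, taps, [True] * len(target))
    # Speech mask: True for every sample of a frame whose mean energy is high.
    mask = []
    for start in range(0, len(isolated), frame):
        chunk = isolated[start:start + frame]
        speech_present = sum(x * x for x in chunk) / len(chunk) > threshold
        mask.extend([speech_present] * len(chunk))
    # Stage 2: adaptation enabled only where no speech was detected.
    output = run_aic(target, reference, taps, [not m for m in mask])
    return isolated, mask, output


# Synthetic signals: filtered reference noise plus a short "speech" burst.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(640)]
speech = [0.8 * math.sin(0.3 * n) if 320 <= n < 384 else 0.0 for n in range(640)]
target = [0.5 * reference[n] + 0.25 * (reference[n - 1] if n else 0.0) + speech[n]
          for n in range(640)]
isolated, mask, output = cascaded_aic(target, reference)
```

Because the second stage's coefficients are frozen while the mask is set, the burst passes through it essentially unchanged, whereas the always-adapting first stage keeps adjusting its coefficients during the burst.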
-
Publication No.: US20220109929A1
Publication date: 2022-04-07
Application No.: US17553976
Filing date: 2021-12-17
IPC classification: H04R3/00, H04R5/027, G06F3/16, H04R5/04, G10L21/0224, G10L21/0208, G10L21/02, G10L21/0272
Abstract: Techniques for improving adaptive interference cancellation (AIC) using cascaded AIC algorithms are described. To improve the accuracy of speech detection, a device may perform a first stage of AIC to generate isolated audio data and may generate speech mask data indicating time windows when speech is detected in the isolated audio data. Based on the speech mask data, the device may perform a second stage of AIC to generate output audio data, with adaptation of the adaptive filter enabled when speech is not detected and disabled when speech is detected. Thus, the first AIC stage improves the accuracy with which the device detects that speech is present, and the second AIC stage reduces distortion in the output audio data by not updating filter coefficient values when speech is present. The first AIC stage may use playback audio data, microphone audio data, or beamformed audio data as reference signals.
-
Publication No.: US11217235B1
Publication date: 2022-01-04
Application No.: US16686808
Filing date: 2019-11-18
Inventors: Wai Chung Chu, Anshuman Ganguly, Carlo Murgia
IPC classification: G10L15/20, G10L25/21, H04R1/40, H04R3/00, G10L21/0232, G10L15/22, G05D1/00, G10L25/84, G10L21/0208
Abstract: A device capable of autonomous motion may move in response to a user speaking an utterance, such as a command. Before moving, the device processes audio data received from a microphone array to identify different audio signals arriving at the device from different directions. Based on properties of the audio signals, the device determines which of the audio signals are merely reflections of other audio.
-
Publication No.: US10586534B1
Publication date: 2020-03-10
Application No.: US15717503
Filing date: 2017-09-27
IPC classification: G10L21/02, G10L15/20, G10L15/22, G10L25/84, G10L21/0232, G10L15/30, G10L15/08, G10L21/0208
Abstract: Devices and techniques are generally described for control of a voice-controlled device using acoustic echo cancellation statistics. A reference signal representing an audio stream may be sent to an acoustic echo cancellation (AEC) unit. A microphone may receive an input audio signal and send the input audio signal to the AEC unit. The AEC unit may attenuate at least a part of the input audio signal. AEC statistics related to the attenuation of at least the part of the input audio signal may be determined over a first period of time. A wake-word in the input audio signal may be detected during the first period of time. A determination may be made that the wake-word is part of the playback of the audio stream based at least in part on the AEC statistics.
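One way to picture the decision in this abstract is with echo return loss enhancement (ERLE), a common measure of how much an echo canceller attenuates its input. The Python sketch below is only an assumption about what such a statistic could look like: the abstract does not name ERLE, and the function names and the 15 dB threshold are hypothetical.

```python
import math


def erle_db(mic, residual, eps=1e-12):
    """Echo return loss enhancement: input power over post-AEC power, in dB."""
    p_in = sum(x * x for x in mic) / len(mic)
    p_out = sum(x * x for x in residual) / len(residual)
    return 10.0 * math.log10((p_in + eps) / (p_out + eps))


def wake_word_from_playback(mic, residual, threshold_db=15.0):
    """Treat a wake-word as playback if the AEC removed most of the signal.

    mic and residual cover the time span in which the wake-word was detected;
    the threshold is an illustrative value, not one taken from the patent.
    """
    return erle_db(mic, residual) >= threshold_db


# Example: the AEC removes nearly everything (playback) vs. almost nothing.
mic = [1.0, -1.0] * 50
mostly_cancelled = [0.01 * x for x in mic]
barely_cancelled = [0.9 * x for x in mic]
```

Strong attenuation during the wake-word interval suggests the wake-word was in the reference signal, that is, it came from the device's own loudspeaker rather than from a user.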
-
Publication No.: US09966059B1
Publication date: 2018-05-08
Application No.: US15697088
Filing date: 2017-09-06
IPC classification: G10K11/178, G10L21/0216, H04R1/08, H04R1/32, H04R1/46, H04R3/00
CPC classification: G10K11/178, G10K11/346, G10L21/0208, G10L21/0216, G10L2021/02165, G10L2021/02166, H04R1/08, H04R1/32, H04R1/46, H04R3/002
Abstract: An acoustic interference cancellation system performs beamforming using a subset of microphones from a microphone array. For example, a first group of microphones from an array can be used to generate target signals that focus on the direction of the desired speech in the audio, and a second group of microphones from the array can be used to generate reference signals that include the environmental noise, audio from a loudspeaker, etc. The reference signals of the second group of microphones can then be used to isolate the actual speech from the target signals of the first group of microphones. The microphone array can be three-dimensional, allowing a device to simplify beamforming calculations by selecting subsets of microphones along different planes. In addition, directional microphones and remote microphones may be used to improve the quality of the reference signals.
-
Publication No.: US11386911B1
Publication date: 2022-07-12
Application No.: US16915037
Filing date: 2020-06-29
IPC classification: G10L21/0232, H04R3/04, H04R5/04, H04R3/00, G10L21/0208
Abstract: A system is configured to improve audio processing by performing dereverberation and noise reduction during a communication session. The system may apply a two-channel dereverberation algorithm by calculating coherence-to-diffuse ratio (CDR) values and calculating dereverberation (DER) gain values based on the CDR values. While the DER gain values may be calculated at a first stage within the pipeline, the device may apply the DER gain values at a second stage within the pipeline. For example, the device may calculate the DER gain values prior to performing residual echo suppression (RES) processing but may apply the DER gain values after performing RES processing, in order to avoid excessive attenuation of the local speech. In addition to removing reverberation, the DER gain values also remove diffuse noise components, reducing the amount of noise reduction required. Thus, the device may soften noise reduction when the DER gain values are applied.
-
Publication No.: US10972834B1
Publication date: 2021-04-06
Application No.: US16787580
Filing date: 2020-02-11
Inventors: Kuan-Chieh Yen, Daniel Wayne Harris, Carlo Murgia, Taro Kimura
Abstract: This disclosure describes techniques for detecting voice commands from a user of an ear-based device. The ear-based device may include an in-ear facing microphone to capture sound emitted in an ear of the user, and an exterior facing microphone to capture sound emitted in an exterior environment of the user. The in-ear microphone may generate an inner audio signal representing the sound emitted in the ear, and the exterior microphone may generate an outer audio signal representing sound from the exterior environment. The ear-based device may compute a ratio of the power of the inner audio signal to the power of the outer audio signal and may compare this ratio to a threshold. If the ratio is larger than the threshold, the ear-based device may detect the voice of the user. Further, the ear-based device may set a value of the threshold based on a level of acoustic seal of the ear-based device.
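The ratio test in this abstract can be sketched in a few lines of Python. The linear mapping from seal level to threshold, its direction (a tighter seal giving a higher threshold), the endpoint values, and all function names are assumptions for illustration; the abstract only states that the threshold depends on the level of acoustic seal.

```python
def mean_power(signal):
    """Average power of one frame of samples."""
    return sum(x * x for x in signal) / len(signal)


def threshold_for_seal(seal_level, loose=2.0, tight=8.0):
    """Interpolate a ratio threshold from a seal level in [0, 1].

    Assumed direction: a tighter seal raises the threshold. The endpoint
    values are placeholders, not figures from the patent.
    """
    seal_level = min(1.0, max(0.0, seal_level))
    return loose + (tight - loose) * seal_level


def user_voice_detected(inner, outer, seal_level, eps=1e-12):
    """Compare the inner/outer power ratio against the seal-dependent threshold."""
    ratio = mean_power(inner) / (mean_power(outer) + eps)
    return ratio > threshold_for_seal(seal_level)
```

For instance, `user_voice_detected([0.5] * 100, [0.1] * 100, seal_level=1.0)` evaluates a ratio of about 25 against a threshold of 8.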
-
Publication No.: US11950062B1
Publication date: 2024-04-02
Application No.: US17709563
Filing date: 2022-03-31
Inventors: Wai Chung Chu, Carlo Murgia
Abstract: A system configured to improve sound source localization (SSL) processing by reducing a number of direction vectors and grouping the direction vectors into direction cells is provided. The system performs clustering to generate a smaller set of direction vectors included in a delay-direction codebook, reducing a size of the codebook to the number of unique delay vectors. In addition, the system groups the direction vectors into direction cells having a regular structure (e.g., predetermined uniformity and/or symmetry), which simplifies SSL processing and results in a substantial reduction in computational cost. The system may also select between multiple codebooks and/or dynamically adjust the codebook to compensate for changes to the microphone array. For example, a device with a microphone array fixed to a display that can tilt may adjust the codebook based on a tilt angle of the display to improve accuracy.
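The codebook reduction in this abstract can be pictured as follows: many candidate directions produce the same whole-sample inter-microphone delays, so they can share one codebook entry. The Python sketch below groups directions by identical delay vector. The far-field model, rounding to whole samples, the two-microphone geometry, and all names are assumptions; the patent's actual clustering and direction-cell construction are not reproduced here.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second (assumed room-temperature value)


def delay_vector(direction, mics, sample_rate):
    """Whole-sample delay of each microphone relative to mics[0] (far field)."""
    ref = mics[0]
    delays = []
    for mic in mics[1:]:
        offset = [m - r for m, r in zip(mic, ref)]
        path = sum(d * o for d, o in zip(direction, offset))
        delays.append(round(sample_rate * path / SPEED_OF_SOUND))
    return tuple(delays)


def build_codebook(directions, mics, sample_rate):
    """Map each unique delay vector to the directions that produce it."""
    codebook = {}
    for direction in directions:
        key = delay_vector(direction, mics, sample_rate)
        codebook.setdefault(key, []).append(direction)
    return codebook


# Hypothetical setup: two microphones 5 cm apart, one candidate per degree.
mics = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0)]
directions = [(math.cos(math.radians(a)), math.sin(math.radians(a)), 0.0)
              for a in range(360)]
codebook = build_codebook(directions, mics, 16000)
```

With this geometry the 360 candidate azimuths collapse into five distinct delay vectors, so a search over the codebook needs five evaluations instead of 360.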
-
Publication No.: US11528571B1
Publication date: 2022-12-13
Application No.: US17150599
Filing date: 2021-01-15
Inventors: Ian Ernan Liu, Carlo Murgia, Huiqun Han, Zhouhui Miao
Abstract: A system is configured to perform microphone occlusion event detection. When a device detects a microphone occlusion event, the device will modify audio processing performed prior to speech processing, such as by disabling spatial processing and only processing audio data from a single microphone. The device detects the microphone occlusion event by determining inter-level difference (ILD) values between two microphone signals and using the ILD values as input features to a classifier. For example, when a far-end reference signal is inactive, the classifier may process a first ILD value within a high frequency band. However, when the far-end reference signal is active, the classifier may process the first ILD value and a second ILD value within a low frequency band.
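The feature selection described in this abstract can be sketched in Python as below. The band edges (4 to 8 kHz and 100 Hz to 1 kHz), the naive DFT used to measure band power, and the threshold rule standing in for the classifier are all assumptions for illustration; the abstract specifies neither the bands nor the classifier.

```python
import cmath
import math


def band_power(frame, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes for bins with f_lo <= frequency < f_hi."""
    n = len(frame)
    power = 0.0
    for k in range(n // 2 + 1):
        if f_lo <= k * sample_rate / n < f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(frame))
            power += abs(coeff) ** 2
    return power


def ild_db(mic_a, mic_b, sample_rate, f_lo, f_hi, eps=1e-12):
    """Inter-level difference between two microphones in one band, in dB."""
    p_a = band_power(mic_a, sample_rate, f_lo, f_hi)
    p_b = band_power(mic_b, sample_rate, f_lo, f_hi)
    return 10.0 * math.log10((p_a + eps) / (p_b + eps))


def occlusion_features(mic_a, mic_b, sample_rate, far_end_active):
    """High-band ILD alone, or high- and low-band ILD when the far end is active."""
    features = [ild_db(mic_a, mic_b, sample_rate, 4000.0, 8000.0)]
    if far_end_active:
        features.append(ild_db(mic_a, mic_b, sample_rate, 100.0, 1000.0))
    return features


def is_occluded(features, threshold_db=10.0):
    """Placeholder for the classifier: flag any large level difference."""
    return any(abs(value) > threshold_db for value in features)


# Example: a 5 kHz tone that is 20 dB weaker at the second microphone.
tone = [math.sin(2 * math.pi * 5000 * i / 16000) for i in range(64)]
quiet = [0.1 * x for x in tone]
features_idle = occlusion_features(tone, quiet, 16000, far_end_active=False)
features_busy = occlusion_features(tone, quiet, 16000, far_end_active=True)
```

A trained classifier would replace `is_occluded`; the point of the sketch is only that the feature vector grows from one ILD value to two when the far-end reference is active.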
-
Publication No.: US11290802B1
Publication date: 2022-03-29
Application No.: US15883888
Filing date: 2018-01-30
Inventors: Dibyendu Nandy, Milos Jorgovanovic, Carlo Murgia
IPC classification: H04R1/10, G10L15/08, H04R1/40, H04R3/00, G10L25/21, H04R3/04, G10L15/22, G10L25/78, G10L15/30
Abstract: Techniques for detecting a voice command from a user of a hearable device are described. The hearable device may include an in-ear facing microphone to capture sound emitted from an ear of the user, and an exterior facing microphone to capture sound emitted from an exterior environment of the user. The in-ear microphone may generate an in-ear audio signal representing the sound emitted from the ear, and the exterior microphone may generate an exterior audio signal representing sound from the exterior environment. The hearable device may include components to determine correlations or similarities between the in-ear audio signal and exterior audio signal, which indicate that the audio signals represent sound emitted from the user. Further, the components may perform voice activity detection to determine that the sound emitted from the user is a voice command, and proceed to perform further voice-processing techniques.
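A minimal Python sketch of the two checks in this abstract follows: a similarity measure between the in-ear and exterior signals, then a voice activity test. Zero-lag normalized correlation, the energy-based activity check, the thresholds, and the function names are all assumptions; the abstract does not say which correlation measure or detector is used.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag Pearson correlation between two equal-length frames."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # at least one frame is constant, so no usable correlation
    return sum(x * y for x, y in zip(da, db)) / denom


def is_voice_active(frame, energy_threshold=1e-3):
    """Crude energy-based stand-in for a voice activity detector."""
    return sum(x * x for x in frame) / len(frame) > energy_threshold


def user_voice_candidate(in_ear, exterior, min_corr=0.6):
    """True if the two signals are similar and the in-ear frame is active."""
    similar = normalized_correlation(in_ear, exterior) >= min_corr
    return similar and is_voice_active(in_ear)


# Example frames: the same tone at both microphones vs. a quadrature tone.
tone = [math.sin(2 * math.pi * i / 20) for i in range(100)]
same_source = [0.5 * x for x in tone]
other_source = [math.cos(2 * math.pi * i / 20) for i in range(100)]
```

A real device would likely search over several lags, since sound reaching the ear canal through the body and through the air arrives at different times; the zero-lag form is kept here for brevity.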