Abstract:
Electronic system for audio noise processing and noise reduction comprises: first and second noise estimators, selector and attenuator. First noise estimator processes first audio signal from voice beamformer (VB) and generates first noise estimate. VB generates first audio signal by beamforming audio signals from first and second audio pick-up channels. Second noise estimator processes first audio signal and second audio signal from noise beamformer (NB), in parallel with first noise estimator, and generates second noise estimate. NB generates second audio signal by beamforming audio signals from first and second audio pick-up channels. First and second audio signals include frequencies in first and second frequency regions. Selector's output noise estimate may be a) second noise estimate in the first frequency region, and b) first noise estimate in the second frequency region. Attenuator attenuates first audio signal in accordance with output noise estimate. Other embodiments are also described.
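The selector and attenuator described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the per-bin power representation, the crossover index `split_bin`, the choice of which bins form the "first frequency region", the spectral-subtraction-style gain, and the gain `floor` are all hypothetical and do not come from the abstract.

```python
def select_noise_estimate(first_est, second_est, split_bin):
    """Build the selector's output noise estimate, one value per frequency bin.

    Assumption: bins below split_bin form the "first frequency region" and
    take the second noise estimate; the remaining bins take the first one.
    """
    if len(first_est) != len(second_est):
        raise ValueError("both estimates must cover the same bins")
    return [
        second_est[k] if k < split_bin else first_est[k]
        for k in range(len(first_est))
    ]


def attenuate(signal_power, noise_est, floor=0.1):
    """Attenuate each bin of the first audio signal's power spectrum.

    Assumption: a spectral-subtraction-style gain of 1 - noise/signal,
    limited from below by `floor`.
    """
    out = []
    for p, n in zip(signal_power, noise_est):
        gain = max(floor, 1.0 - n / p) if p > 0 else floor
        out.append(gain * p)
    return out
```

For example, `select_noise_estimate([1, 2, 3, 4], [10, 20, 30, 40], split_bin=2)` takes the first two bins from the second estimate and the last two from the first.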
Abstract:
A method for controlling a speech enhancement process in a far-end device, while engaged in a voice or video telephony communication session over a communication link with a near-end device. A near-end user speech signal is produced, using a microphone to pick up speech of a near-end user, and is analyzed by an automatic speech recognizer (ASR) without being triggered by an ASR trigger phrase or button. The recognized words are compared to a library of phrases to select a matching phrase, where each phrase is associated with a message that represents an audio signal processing operation. The message associated with the matching phrase is sent to the far-end device, where it is used to configure the far-end device to adjust the speech enhancement process that produces the far-end speech signal. Other embodiments are also described.
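The phrase-matching step above can be sketched as a lookup from recognized text to a message. This is only an illustration: the phrases, the message names, and the substring-matching rule are hypothetical and are not taken from the abstract, which does not specify how the comparison is made.

```python
# Hypothetical phrase library. Each phrase maps to a message naming an
# audio signal processing operation; all entries are invented for illustration.
PHRASE_LIBRARY = {
    "i can't hear you": "INCREASE_GAIN",
    "you sound muffled": "REDUCE_NOISE_SUPPRESSION",
    "there is a lot of noise": "INCREASE_NOISE_SUPPRESSION",
}


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def match_phrase(recognized_text, library=PHRASE_LIBRARY):
    """Return the message of the first library phrase found in the ASR output.

    Assumption: a plain substring match on normalized text. Returns None
    when no phrase matches, in which case nothing would be sent.
    """
    text = normalize(recognized_text)
    for phrase, message in library.items():
        if phrase in text:
            return message
    return None
```

In this sketch the returned message is what the near-end device would send over the communication link; the transport itself is left out.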
Abstract:
Signals are received from audio pickup channels that contain signals from multiple sound sources. The audio pickup channels may include one or more microphones and one or more accelerometers. Signals representative of multiple sound sources are generated using a blind source separation algorithm. It is then determined which of those signals is deemed to be a voice signal and which is deemed to be a noise signal. The output noise signal may be scaled to match a level of the output voice signal, and a clean speech signal is generated based on the output voice signal and the scaled noise signal. Other aspects are described.
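The scaling and clean-speech steps above might look like the following sketch. The abstract does not say how the level is matched or how the two signals are combined, so everything here is an assumption: per-frame power values as inputs, a hypothetical list `noise_only_frames` of frames without speech (for instance from a voice activity detector), power subtraction as the combining rule, and a `floor` that keeps the result positive.

```python
def noise_scale(voice_power, noise_power, noise_only_frames):
    """Scale factor that matches the noise output's level to the voice output.

    Assumption: levels are compared over frames believed to contain no
    speech, where the voice output holds only residual noise.
    """
    num = sum(voice_power[i] for i in noise_only_frames)
    den = sum(noise_power[i] for i in noise_only_frames)
    return num / den if den > 0 else 1.0


def clean_speech_power(voice_power, noise_power, scale, floor=0.05):
    """Subtract the scaled noise power from the voice power, frame by frame.

    Assumption: simple power subtraction with a floor proportional to the
    voice power, so the output never drops to zero or below.
    """
    return [
        max(v - scale * n, floor * v)
        for v, n in zip(voice_power, noise_power)
    ]
```

With `voice_power = [2.0, 10.0, 2.0]`, `noise_power = [4.0, 4.0, 4.0]`, and frames 0 and 2 marked as noise-only, the scale factor is 0.5 and only the middle frame keeps most of its power.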
Abstract:
Method of speech enhancement using Neural Network-based combined signal starts with training neural network offline which includes: (i) exciting at least one accelerometer and at least one microphone using training accelerometer signal and training acoustic signal, respectively. The training accelerometer signal and the training acoustic signal are correlated during clean speech segments. Training neural network offline further includes (ii) selecting speech included in the training accelerometer signal and in the training acoustic signal, and (iii) spatially localizing the speech by setting a weight parameter in the neural network based on the selected speech included in the training accelerometer signal and in the training acoustic signal. The neural network that is trained offline is then used to generate a speech reference signal based on an accelerometer signal from the at least one accelerometer and an acoustic signal received from the at least one microphone. Other embodiments are described.
Abstract:
Digital signal processing techniques for automatically reducing audible noise from a sound recording that contains speech. A noise suppression system uses two types of noise estimators, including a more aggressive one and a less aggressive one. Decisions are made on how to select or combine their outputs into a usable noise estimate in different speech and noise conditions. A 2-channel noise estimator is described. Other embodiments are also described and claimed.
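One way to picture the select-or-combine decision above is a per-bin rule that falls back to the less aggressive estimate when the two disagree strongly. The abstract does not state the actual decision logic, so the ratio test and the `max_ratio` threshold below are purely hypothetical.

```python
def combine_estimates(conservative, aggressive, max_ratio=4.0):
    """Choose one noise estimate per frequency bin.

    Assumed rule: use the aggressive estimate unless it exceeds the
    conservative one by more than max_ratio, which is treated here as a
    sign that speech has leaked into it; in that case use the
    conservative estimate instead.
    """
    out = []
    for c, a in zip(conservative, aggressive):
        if a > max_ratio * c:
            out.append(c)
        else:
            out.append(a)
    return out
```

Under this rule, `combine_estimates([1.0, 1.0], [2.0, 10.0])` keeps the aggressive value in the first bin and the conservative value in the second.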