Abstract:
The technology described in this document can be embodied in a method that includes receiving an input signal representing audio captured by a microphone of an active noise reduction (ANR) headphone, processing, by one or more processing devices, a portion of the input signal to determine a noise level in the input signal, and determining that the noise level satisfies a threshold condition. The method also includes, in response to determining that the noise level satisfies the threshold condition, generating an output signal in which ANR processing on the input signal is controlled in accordance with a target loudness level of the output signal, and driving an acoustic transducer of the ANR headphone using the output signal.
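The control flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the abstract does not define the threshold condition or how ANR is controlled, so the sketch assumes the condition is "noise level at or above a threshold" and models the control as a single broadband gain. The names `noise_level_db` and `residual_gain` and the default dB values are hypothetical.

```python
import math


def noise_level_db(frame):
    """Return the RMS level of a frame in dB relative to full scale (1.0)."""
    if not frame:
        raise ValueError("frame must not be empty")
    mean_square = sum(s * s for s in frame) / len(frame)
    # The floor avoids log10(0) for an all-zero frame.
    return 10.0 * math.log10(max(mean_square, 1e-12))


def residual_gain(frame, threshold_db=-30.0, target_db=-40.0):
    """Hypothetical control rule: attenuate toward a target loudness.

    Assumption: the threshold condition is "level >= threshold_db".
    Below the threshold the signal is passed unchanged (gain 1.0).
    """
    level_db = noise_level_db(frame)
    if level_db < threshold_db:
        return 1.0
    # Linear gain that would bring the measured level down to the target.
    # Clamped so that the rule never amplifies.
    return min(1.0, 10.0 ** ((target_db - level_db) / 20.0))
```

For a frame at -20 dBFS with the defaults above, the rule asks for 20 dB of attenuation, which is a linear gain of about 0.1.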
Abstract:
Techniques for signal de-noising based on empirical mode decomposition (EMD) are disclosed. The techniques use statistical characteristics of intrinsic mode functions (IMFs) to identify information-carrying IMFs, so that the identified relevant IMFs can be partially reconstructed into a de-noised signal. The present disclosure identifies that the statistical characteristics of IMFs with noise tend to follow a generalized Gaussian distribution (GGD) rather than only a Gaussian or Laplace distribution. Accordingly, a framework for relevant IMF selection is disclosed that includes, in part, performing a null hypothesis test against a distribution of each IMF derived from the use of a generalized probability density function (PDF). IMFs that contribute more noise than signal may thus be identified through the null hypothesis test. Conversely, the aspects and embodiments disclosed herein enable the determination of which IMFs contribute more signal than noise. Thus, a signal may be partially reconstructed from the predominantly information-carrying IMFs to yield a de-noised output signal.
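To make the role of the generalized Gaussian distribution concrete, the sketch below evaluates the standard GGD density and shows that it contains the Gaussian and Laplace distributions as special cases. It covers only the density, not the null hypothesis test or the IMF selection, which the abstract does not specify. The function name `ggd_pdf` and its parameter names are assumptions for illustration.

```python
import math


def ggd_pdf(x, alpha, beta, mu=0.0):
    """Generalized Gaussian density with scale alpha, shape beta, location mu.

    f(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta)

    beta = 2 gives a Gaussian with standard deviation alpha / sqrt(2).
    beta = 1 gives a Laplace distribution with scale alpha.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x - mu) / alpha) ** beta))
```

Because the shape parameter is free, a fit to IMF samples is not forced to choose between the two special cases, which is the property the abstract relies on.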
Abstract:
Systems and methods for assisting automatic speech recognition (ASR) are provided. An example method includes generating, by a mobile device, a plurality of instantiations of a speech component in a captured audio signal, each instantiation of the plurality of instantiations being in support of a particular hypothesis regarding the speech component. At least two instantiations of the plurality of instantiations are then sent to a remote ASR engine. The remote ASR engine is configured to recognize at least one word based on the at least two of the plurality of instantiations and a user context, according to various embodiments. This recognition can include selecting one of the instantiations of the speech component from the plurality of instantiations. The plurality of instantiations may be generated by noise suppression of the captured audio signal with different degrees of aggressiveness. In some embodiments, the plurality of instantiations is generated by synthesizing the speech component from synthetic speech parameters obtained by a spectral analysis of the captured audio signal.
Abstract:
An estimated system gain spectrum of an acoustic system is generated and updated in real time to respond to changes in the acoustic system. Peak gains in the estimated system gain spectrum are tracked as the estimated system gain spectrum is updated. Based on the tracking, at least one frequency at which the estimated system gain spectrum is currently exhibiting a peak gain is identified. Based on the identification of the at least one frequency, an audio equalizer is controlled to apply, to a first speech-containing signal to be played out via an audio output device of the audio device and/or to a second speech-containing signal received via an audio input device of the audio device, an equalization filter to reduce the level of that signal at the identified frequency. The equalization filter is applied continuously throughout intervals of both speech activity and speech inactivity in that signal.
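The update-and-track loop described above might look like the following sketch. It is illustrative only: the abstract does not say how the gain spectrum is estimated or smoothed, so exponential smoothing is an assumption, and the names `update_gain_estimate` and `peak_frequency` are hypothetical. The equalization filter itself is not shown.

```python
def update_gain_estimate(previous_db, measured_db, smoothing=0.9):
    """Blend a new per-bin gain measurement into the running estimate.

    Assumption: exponential smoothing per frequency bin, gains in dB.
    """
    if len(previous_db) != len(measured_db):
        raise ValueError("spectra must have the same number of bins")
    return [
        smoothing * prev + (1.0 - smoothing) * meas
        for prev, meas in zip(previous_db, measured_db)
    ]


def peak_frequency(gain_db, bin_hz):
    """Return (frequency in Hz, gain in dB) of the largest bin."""
    if not gain_db:
        raise ValueError("gain spectrum must not be empty")
    index = max(range(len(gain_db)), key=lambda i: gain_db[i])
    return index * bin_hz, gain_db[index]
```

Calling `peak_frequency` after every `update_gain_estimate` yields the frequency at which an equalizer would currently need to reduce the signal level.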
Abstract:
An apparatus and a method for remedying an auditory defect are disclosed. The method comprises the following steps: receiving an incoming sound signal as an input signal, the incoming sound signal having at least one channel; adjusting the frequency response of the at least one channel of the input signal by filtering out frequencies outside the frequency range of speech of a specific language; and outputting the filtered signal of the at least one channel.
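The filtering step can be sketched with a single band-pass biquad. This is a rough illustration, not the disclosed apparatus: the abstract gives no band edges, so any values passed in (for example a 300 to 3400 Hz telephone-style band) are assumptions, and one second-order section only attenuates out-of-band content gently rather than removing it. The coefficient formulas are the widely used Audio EQ Cookbook band-pass with unity peak gain; the function names are hypothetical.

```python
import math


def bandpass_coefficients(low_hz, high_hz, sample_rate_hz):
    """Return normalized biquad coefficients (b, a) for a band-pass filter."""
    if not 0.0 < low_hz < high_hz < sample_rate_hz / 2.0:
        raise ValueError("need 0 < low_hz < high_hz < Nyquist")
    center_hz = math.sqrt(low_hz * high_hz)   # geometric center of the band
    q = center_hz / (high_hz - low_hz)        # quality factor from bandwidth
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def filter_channel(samples, b, a):
    """Apply the biquad to one channel using the Direct Form I structure."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

A multi-channel signal would be handled by calling `filter_channel` once per channel with the same coefficients.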
Abstract:
Speech received from a microphone array is enhanced. In one example, a noise filtering system receives audio from a plurality of microphones, determines a beamformer output from the received audio, applies a first auto-regressive moving average smoothing filter to the beamformer output, determines noise estimates from the received audio, applies a second auto-regressive moving average smoothing filter to the noise estimates, and combines the first and second smoothing filter outputs to produce a power spectral density output of the received audio with reduced noise.
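The two smoothing filters and their combination can be sketched as below. The abstract does not give the filter order, the coefficients, or the combining rule, so this sketch assumes a first-order ARMA filter with unity DC gain applied to per-frame power values of one frequency bin, and assumes the outputs are combined by floored subtraction. The names `arma_smooth` and `reduced_noise_psd` are hypothetical.

```python
def arma_smooth(values, ar=0.8, ma=(0.1, 0.1)):
    """First-order ARMA smoother: y[n] = ar*y[n-1] + ma[0]*x[n] + ma[1]*x[n-1].

    With the defaults, (ma[0] + ma[1]) / (1 - ar) = 1, so a constant input
    is passed through unchanged. State is seeded with the first sample.
    """
    if not values:
        return []
    prev_x = prev_y = values[0]
    out = []
    for x in values:
        y = ar * prev_y + ma[0] * x + ma[1] * prev_x
        out.append(y)
        prev_x, prev_y = x, y
    return out


def reduced_noise_psd(beam_power, noise_power):
    """Assumed combining rule: smoothed beam power minus smoothed noise power.

    The result is floored at zero so the power estimate never goes negative.
    """
    smooth_beam = arma_smooth(beam_power)
    smooth_noise = arma_smooth(noise_power)
    return [max(b - n, 0.0) for b, n in zip(smooth_beam, smooth_noise)]
```

Using separate filter instances for the beamformer output and the noise estimates leaves room for different coefficients on each path, even though the sketch uses the same defaults for both.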
Abstract:
A far-end signal is received at a device, a marker signal is inserted into the far-end signal, and the far-end signal with the marker signal is played on a speaker. A near-end signal is received via a microphone, and the marker signal is detected in the received near-end signal. The detected marker signal is used to determine a delay, which is then used to cancel at least some of an echo in the near-end signal. The marker may be ultrasonic. The echo canceller and other processing may run at a lower sampling frequency than the marker detection.
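One common way to detect a known marker and read off the delay is a sliding correlation, sketched below. The abstract does not say how the marker is detected, so the correlation approach is an assumption, and `estimate_delay` is a hypothetical name. The delay is returned in samples at the rate of the supplied signal.

```python
def estimate_delay(marker, near_end):
    """Return the lag (in samples) at which the marker best matches near_end.

    Slides the marker across the near-end signal and keeps the lag with the
    largest correlation score. Ties resolve to the earliest lag.
    """
    if not marker or len(near_end) < len(marker):
        raise ValueError("near_end must be at least as long as marker")
    best_lag = 0
    best_score = float("-inf")
    for lag in range(len(near_end) - len(marker) + 1):
        score = sum(m * near_end[lag + i] for i, m in enumerate(marker))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Usage: a marker that arrives attenuated after 7 samples of silence.
marker = [1.0, -1.0, 1.0, 1.0, -1.0]
near_end = [0.0] * 7 + [0.5 * m for m in marker] + [0.0] * 3
delay_samples = estimate_delay(marker, near_end)
```

If the marker detection runs at a higher sampling frequency than the echo canceller, the delay in samples would need to be rescaled by the ratio of the two rates before use.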
Abstract:
Example embodiments disclosed herein relate to user-experience-oriented audio signal processing. There is provided a method for user-experience-oriented audio signal processing. The method includes obtaining a first audio signal from an audio sensor of an electronic device; computing, based on the first audio signal, a compensation factor for an acoustic path from the electronic device to a listener; and applying the compensation factor to a second audio signal outputted from the electronic device. Corresponding system and computer program products are also disclosed.