Abstract:
Audio signal processing for adaptive de-reverberation uses a least mean squares (LMS) filter with improved convergence over conventional LMS filters. This makes embodiments practical for reducing the effects of reverberation in many portable and embedded devices, such as smartphones, tablets, laptops, and hearing aids, for applications such as speech recognition and audio communication in general. The LMS filter employs a frequency-dependent adaptive step size to speed up the convergence of the predictive filter process, requiring fewer computational steps than a conventional LMS filter applied to the same inputs. The improved convergence is achieved at a low memory cost. Controlling the updates of the prediction filter when the acoustic channel is highly non-stationary improves performance under such conditions. The techniques are suitable for single or multiple channels and are applicable to microphone array processing.
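The idea of a per-frequency step size can be illustrated with a minimal sketch. It is not the claimed filter. It assumes the signal is already split into frequency bins (for example by an STFT), that each bin is processed by a delayed linear predictor, and that the step size in each bin is a base value divided by a running power estimate for that bin. The names `predict_bin`, `dereverberate`, `synthetic_bin` and all parameter values are hypothetical.

```python
import random

def predict_bin(frames, mu, order=2, delay=2, alpha=0.9, eps=1e-8):
    """Run a delayed linear predictor on one frequency bin; return the residual."""
    w = [0j] * order          # prediction filter taps for this bin
    power = 0.0               # running power of the regressor in this bin
    residual = []
    for t, d in enumerate(frames):
        # Regressor: `order` past frames, the newest one `delay` frames back.
        x = [frames[t - delay - j] if t - delay - j >= 0 else 0j
             for j in range(order)]
        e = d - sum(wj * xj for wj, xj in zip(w, x))   # prediction error
        power = alpha * power + (1 - alpha) * sum(abs(xj) ** 2 for xj in x)
        step = mu / (power + eps)                      # assumed per-bin adaptive step
        w = [wj + step * e * xj.conjugate() for wj, xj in zip(w, x)]
        residual.append(e)
    return residual

def dereverberate(stft_bins, mu_per_bin):
    """Apply the predictor to every bin, each with its own base step size."""
    return [predict_bin(frames, mu) for frames, mu in zip(stft_bins, mu_per_bin)]

def synthetic_bin(n=2000, echo=0.6, seed=0):
    """Hypothetical test signal: white complex source plus one late echo."""
    rng = random.Random(seed)
    x = []
    for t in range(n):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        x.append(s + (echo * x[t - 2] if t >= 2 else 0))
    return x
```

Calling `dereverberate([synthetic_bin()], [0.05])` returns one residual sequence per bin. Because the running power never falls below `(1 - alpha)` times the current regressor energy, the effective normalised step stays at or below `mu / (1 - alpha)`, which is how this sketch keeps the update stable.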
Abstract:
Techniques for signal de-noising based on empirical mode decomposition (EMD) are disclosed that use statistical characteristics of intrinsic mode functions (IMFs) to identify information-carrying IMFs and partially reconstruct those relevant IMFs into a de-noised signal. The present disclosure has identified that the statistical characteristics of IMFs with noise tend to follow a generalized Gaussian distribution (GGD) rather than only a Gaussian or Laplace distribution. Accordingly, a framework for relevant IMF selection is disclosed that includes, in part, performing a null hypothesis test against a distribution of each IMF derived from a generalized probability density function (PDF). IMFs that contribute more noise than signal may thus be identified through the null hypothesis test. Conversely, the aspects and embodiments disclosed herein enable the determination of which IMFs contribute more signal than noise. Thus, a signal may be partially reconstructed from the predominantly information-carrying IMFs to yield a de-noised output signal.
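One building block of such a framework is fitting a GGD to the samples of an IMF. The sketch below shows a common moment-matching estimate of the GGD shape parameter, in which a shape of 1 corresponds to a Laplace distribution and a shape of 2 to a Gaussian. It assumes the IMFs have already been computed elsewhere and does not reproduce the null hypothesis test itself, whose details the abstract does not give. The function names are hypothetical.

```python
import math

def ggd_ratio(beta):
    """(E|x|)^2 / E[x^2] for a zero-mean generalized Gaussian with shape beta."""
    return math.gamma(2 / beta) ** 2 / (math.gamma(1 / beta) * math.gamma(3 / beta))

def estimate_ggd_shape(samples, lo=0.1, hi=10.0, iters=60):
    """Moment-matching estimate of the GGD shape parameter by bisection."""
    n = len(samples)
    mean = sum(samples) / n
    m1 = sum(abs(x - mean) for x in samples) / n      # mean absolute deviation
    m2 = sum((x - mean) ** 2 for x in samples) / n    # variance (must be non-zero)
    target = m1 * m1 / m2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_ratio(mid) < target:   # the ratio increases with beta
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

An estimated shape per IMF could then feed whatever test statistic the selection framework uses. The search interval `[0.1, 10]` is an arbitrary choice for the sketch, and estimates that land on either bound should be treated as out of range.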
Abstract:
Example embodiments disclosed herein relate to audio source separation with source direction determined based on iterative weighted component analysis. A method of separating audio sources in audio content is disclosed. The audio content includes a plurality of channels. The method includes obtaining multiple data samples from multiple time-frequency tiles of the audio content. The method also includes analyzing the data samples to generate multiple components in a plurality of iterations, wherein each of the components indicates a direction with a variance of the data samples, and wherein, in each of the plurality of iterations, each of the data samples is weighted with a weight that is determined based on a selected component from the multiple components. The method further includes determining a source direction of the audio content based on the selected component for separating an audio source from the audio content. A corresponding system and computer program product for separating audio sources in audio content are also disclosed.
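A minimal two-channel sketch of iterative weighted component analysis follows. It assumes each data sample is a real-valued (left, right) pair, takes the dominant axis of the weighted, uncentered covariance as the selected component, and re-weights each sample by how closely it aligns with that axis. The weighting rule, the `sharpness` exponent, and the function names are assumptions made for illustration and are not taken from the disclosure.

```python
import math

def principal_direction(samples, weights):
    """Angle (radians) of the dominant axis of weighted, uncentered 2-D data."""
    cxx = sum(w * x * x for (x, y), w in zip(samples, weights))
    cyy = sum(w * y * y for (x, y), w in zip(samples, weights))
    cxy = sum(w * x * y for (x, y), w in zip(samples, weights))
    return 0.5 * math.atan2(2 * cxy, cxx - cyy)

def iterative_direction(samples, iterations=10, sharpness=2):
    """Refine the dominant direction by down-weighting poorly aligned samples."""
    weights = [1.0] * len(samples)
    theta = principal_direction(samples, weights)
    for _ in range(iterations):
        c, s = math.cos(theta), math.sin(theta)
        # Weight = (cos^2 of the angle between sample and component) ** sharpness.
        weights = [((x * c + y * s) ** 2 / (x * x + y * y)) ** sharpness
                   if (x or y) else 0.0 for x, y in samples]
        theta = principal_direction(samples, weights)
    return theta
```

With one dominant source and a weaker interferer at a different angle, the plain (unweighted) estimate is pulled toward the interferer, while the re-weighted estimate settles closer to the dominant source direction.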
Abstract:
A camera system includes a first microphone, a second microphone, and a microphone controller. The first microphone and the second microphone are configured to capture audio over a time interval to produce a first captured audio signal and a second captured audio signal, respectively. The second captured audio signal is dampened relative to the first captured audio signal by a dampening factor. The microphone controller is configured to store the first captured audio signal in response to a determination that the first captured audio signal does not clip. In response to a determination that the first captured audio signal clips, the microphone controller is configured to identify a gain between the first captured audio signal and the second captured audio signal representative of the dampening factor, amplify the second captured audio signal based on the identified gain, and store the amplified second captured audio signal.
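The controller logic described above can be sketched as follows. The sketch assumes samples normalised to a full scale of 1.0, detects clipping by proximity to full scale, and identifies the gain by a least-squares fit over the samples of the first signal that did not clip. The function names, the clipping margin, and the least-squares choice are assumptions.

```python
def clips(signal, full_scale=1.0, margin=1e-3):
    """True if any sample reaches (approximately) full scale."""
    return any(abs(x) >= full_scale - margin for x in signal)

def estimate_gain(first, second, full_scale=1.0, margin=1e-3):
    """Least-squares gain g such that first ~= g * second on unclipped samples."""
    pairs = [(a, b) for a, b in zip(first, second) if abs(a) < full_scale - margin]
    num = sum(a * b for a, b in pairs)
    den = sum(b * b for a, b in pairs)
    return num / den if den else 1.0   # fall back to unity gain if nothing usable

def select_audio(first, second):
    """Return the signal to store: the first if clean, else the amplified second."""
    if not clips(first):
        return list(first)
    g = estimate_gain(first, second)
    return [g * b for b in second]
```

Fitting the gain only on unclipped samples matters because clipped samples of the first signal no longer reflect the true level ratio between the two microphones.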
Abstract:
A sound processing system, method, and program product for estimating parameters from binaural audio data are disclosed. A system is provided having: a system for inputting binaural audio; and a binaural signal analyzer (BICAM) that: performs autocorrelation on both the first channel and the second channel to generate a pair of autocorrelation functions; performs a first layer cross-correlation between the first channel and the second channel to generate a first layer cross-correlation function; removes the center peak from the first layer cross-correlation function and from a selected autocorrelation function to create a modified pair; performs a second layer cross-correlation between the modified pair to determine a temporal mismatch; generates a resulting function by replacing the first layer cross-correlation function with the selected autocorrelation function using the temporal mismatch; and utilizes the resulting function to determine interaural time difference (ITD) parameters and interaural level difference (ILD) parameters of the direct sound components and reflected sound components.
Abstract:
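The object is to provide a signal processing technique with noise suppression performance improved over the prior art. A first component extraction unit 14 extracts, by time-averaging processing, from the power spectral density $\hat{\phi}_S(\omega,\tau)$ of a target area, a non-stationary component $\hat{\phi}_S^{(A)}(\omega,\tau)$ originating from sound arriving from the target area and a stationary component $\hat{\phi}_S^{(B)}(\omega,\tau)$ originating from incoherent noise. A second component extraction unit 15 extracts, from the power spectral density $\hat{\phi}_N(\omega,\tau)$ of a noise area, a non-stationary component $\hat{\phi}_N^{(A)}(\omega,\tau)$ originating from interference noise and a stationary component $\hat{\phi}_N^{(B)}(\omega,\tau)$ originating from incoherent noise.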
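A deliberately crude sketch of splitting a power spectral density into stationary and non-stationary parts by time averaging is shown below. It assumes the stationary component at each frequency is simply the mean over all frames and that the non-stationary component is the non-negative remainder. The abstract does not specify the averaging window or any further processing, so these choices and the function name are assumptions.

```python
def split_components(psd):
    """psd[k][t]: power at frequency bin k, frame t.

    Returns (nonstationary, stationary), where stationary[k] is the time
    average of bin k and nonstationary[k][t] is the non-negative remainder.
    """
    stationary = [sum(row) / len(row) for row in psd]
    nonstationary = [[max(p - s, 0.0) for p in row]
                     for row, s in zip(psd, stationary)]
    return nonstationary, stationary
```

A long-term mean is biased upward by the bursts it is meant to separate out, so a practical system would likely use a more careful estimator. The sketch only illustrates the decomposition into the two components.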
Abstract:
An audio perception system is described, comprising a capture module configured to capture acoustic speech signal information; a feature extraction module configured to extract features that identify a candidate unvoiced portion in an acoustic signal; a classification module configured to identify whether the acoustic signal is or contains an unvoiced portion based on the extracted features; and a control module configured to generate a control signal to a sensory stimulation actuator for generating an aero-tactile stimulation to be delivered to a listener, the control signal based at least in part on a signal representing the identified unvoiced portion. Related methods are also described.
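The module chain can be sketched with two classic frame-level features, zero-crossing rate and energy, which are often used to separate unvoiced from voiced speech. The abstract does not name the features or the classifier, so the features, thresholds, and the mapping from frame energy to an actuator drive level are all assumptions, and the function names are hypothetical.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(frame) - 1, 1)

def energy(frame):
    """Mean squared amplitude of a non-empty frame."""
    return sum(x * x for x in frame) / len(frame)

def is_unvoiced(frame, zcr_min=0.3, energy_min=1e-4):
    """Assumed rule: noise-like (high ZCR) and not silent."""
    return energy(frame) > energy_min and zero_crossing_rate(frame) > zcr_min

def control_level(frame, max_level=1.0):
    """Hypothetical actuator drive level: 0 when no unvoiced portion is detected."""
    return min(max_level, energy(frame) ** 0.5) if is_unvoiced(frame) else 0.0
```

The energy floor keeps silence from being classified as unvoiced, since near-zero frames can otherwise show arbitrary sign changes.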
Abstract:
The present document relates to audio communication systems. In particular, the present document relates to the control of the level of audio signals within audio communication systems. A method for leveling a near-end audio signal (211) using a leveling gain (214) is described. The near-end audio signal (211) comprises a sequence of segments, wherein the sequence of segments comprises a current segment and one or more preceding segments. The method comprises determining a nuisance measure (416) which is indicative of an amount of aberrant voice activity within the sequence of segments of the near-end audio signal (211); and determining the leveling gain (214) for the current segment of the near-end audio signal (211), at least based on the leveling gain (214) for the one or more preceding segments of the near-end audio signal (211), and by taking into account, to a variable degree, an estimate of the level of the current segment of the near-end audio signal (211); wherein the variable degree is dependent on the nuisance measure (416).
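One way to read the gain update is as a recursive smoother whose adaptation rate shrinks as the nuisance measure grows. The sketch below works in decibels and assumes a nuisance measure normalised to the range 0 to 1, a fixed target level, and a linear mapping from nuisance to the variable degree. None of these specifics come from the abstract, and the function name and parameters are hypothetical.

```python
def update_leveling_gain(prev_gain_db, level_db, nuisance,
                         target_db=-20.0, max_rate=0.5):
    """One leveling step for the current segment.

    prev_gain_db: leveling gain used for the preceding segment(s), in dB.
    level_db:     estimated level of the current segment, in dB.
    nuisance:     assumed to lie in [0, 1]; 1 means fully aberrant activity.
    """
    nuisance = min(max(nuisance, 0.0), 1.0)
    degree = max_rate * (1.0 - nuisance)   # how much the current level counts
    desired_db = target_db - level_db      # gain that would hit the target exactly
    return prev_gain_db + degree * (desired_db - prev_gain_db)
```

With `nuisance` at 1 the gain is held at its previous value, so a burst of aberrant activity does not drag the level. With `nuisance` at 0 the gain moves toward the target at the full rate.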