Abstract:
The present document describes a method (600) for estimating source parameters of audio sources (101) from mix audio signals (102). The mix audio signals (102) comprise a plurality of frames. The mix audio signals (102) are representable as a mix audio matrix in a frequency domain and the audio sources (101) are representable as a source matrix in the frequency domain. The method (600) comprises updating (601) an un-mixing matrix (221), which is configured to provide an estimate of the source matrix from the mix audio matrix, based on a mixing matrix (225), which is configured to provide an estimate of the mix audio matrix from the source matrix. Furthermore, the method (600) comprises updating (602) the mixing matrix (225) based on the un-mixing matrix (221) and based on the mix audio signals (102). In addition, the method (600) comprises iterating (603) the updating steps (601, 602) until an overall convergence criterion is met.
Abstract:
A sound source separation device includes a first microphone that picks up a first voice, a second microphone that picks up a second voice, a first crosstalk canceller that removes, from a voice signal of the first microphone, first crosstalk caused when the second voice is picked up by the first microphone, and a second crosstalk canceller that removes, from a voice signal of the second microphone, second crosstalk caused when the first voice is picked up by the second microphone. The first crosstalk canceller uses a voice signal in which the second crosstalk is removed from the voice signal of the second microphone to estimate and calculate a first interference signal indicative of a degree of the first crosstalk, and to remove the calculated first interference signal from the voice signal of the first microphone. The second crosstalk canceller uses a voice signal in which the first crosstalk is removed from the voice signal of the first microphone to estimate and calculate a second interference signal indicative of a degree of the second crosstalk, and to remove the calculated second interference signal from the voice signal of the second microphone.
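The cross-coupled arrangement described above, in which each canceller takes the other canceller's cleaned output as its reference, can be illustrated with a minimal sketch. It is not the device's actual implementation: it assumes that each crosstalk path is a single gain applied with a one-sample delay, and that the gains `a` and `b` are already known, whereas the abstract describes the interference as being estimated. The function name `cross_coupled_cancel` and all signal values are hypothetical.

```python
import random

def cross_coupled_cancel(m1, m2, a, b):
    """Remove crosstalk from two microphone signals with a cross-coupled structure.

    Assumed (hypothetical) crosstalk model:
        m1[n] = s1[n] + a * s2[n-1]
        m2[n] = s2[n] + b * s1[n-1]
    Each canceller subtracts an interference estimate built from the OTHER
    canceller's already-cleaned output, not from the raw microphone signal.
    """
    y1, y2 = [], []
    prev1 = prev2 = 0.0  # previous cleaned samples of channel 1 and channel 2
    for x1, x2 in zip(m1, m2):
        c1 = x1 - a * prev2  # first canceller: uses cleaned second channel
        c2 = x2 - b * prev1  # second canceller: uses cleaned first channel
        y1.append(c1)
        y2.append(c2)
        prev1, prev2 = c1, c2
    return y1, y2

# Synthetic demo with made-up sources and crosstalk gains.
rng = random.Random(1)
n = 200
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
a, b = 0.4, 0.3
m1 = [s1[k] + (a * s2[k - 1] if k > 0 else 0.0) for k in range(n)]
m2 = [s2[k] + (b * s1[k - 1] if k > 0 else 0.0) for k in range(n)]
y1, y2 = cross_coupled_cancel(m1, m2, a, b)
```

Under these assumptions the cleaned outputs match the original sources up to rounding error. In a real device the gains would be replaced by adaptively estimated filters.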
Abstract:
Provided are methods and systems for acoustic keystroke transient cancellation/suppression for user communication devices using a semi-blind adaptive filter model. The methods and systems are designed to overcome existing problems in transient noise suppression by taking into account some less-defective signal as side information on the transients and also accounting for acoustic signal propagation, including the reverberation effects, using dynamic models. The methods and systems take advantage of a synchronous reference microphone embedded in the keyboard of the user device, and utilize an adaptive filtering approach exploiting the knowledge of this keybed microphone signal.
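The idea of using the keybed microphone as a reference for an adaptive filter can be sketched with a plain normalized LMS (NLMS) canceller. This is a simpler stand-in technique, not the semi-blind dynamic model the abstract refers to. The function name `nlms_cancel`, the filter length, the step size, the propagation path, and the synthetic signals are all assumptions made for illustration.

```python
import math
import random

def nlms_cancel(primary, reference, taps=8, mu=0.5, eps=1e-8):
    """Subtract from `primary` the component linearly predictable from `reference`.

    primary:   voice-microphone samples (speech plus keystroke noise)
    reference: keybed-microphone samples (keystroke noise only, by assumption)
    Returns the cleaned signal and the final filter weights.
    """
    w = [0.0] * taps    # adaptive FIR weights
    buf = [0.0] * taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimated keystroke component
        e = d - y                                   # residual = cleaned sample
        norm = eps + sum(bi * bi for bi in buf)     # normalization term
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned, w

# Synthetic demo: white noise stands in for the keystroke reference.
rng = random.Random(0)
n = 2000
reference = [rng.uniform(-1.0, 1.0) for _ in range(n)]
path = [0.6, -0.3, 0.1]  # hypothetical keybed-to-voice-microphone path
speech = [0.1 * math.sin(0.05 * k) for k in range(n)]
primary = [
    speech[k] + sum(path[j] * reference[k - j] for j in range(len(path)) if k - j >= 0)
    for k in range(n)
]
cleaned, weights = nlms_cancel(primary, reference)
```

Once the weights have adapted, the residual consists mostly of the speech component. A fixed FIR path like the one assumed here does not model the time-varying reverberation the abstract mentions.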
Abstract:
A method for generating and playing audio signals and a system for processing audio signals are disclosed. The method for generating audio signals includes: generating distance information about an audio signal corresponding to a view point position, according to obtained auxiliary video and direction information about the audio signal, where the auxiliary video is a disparity map or a depth map; encoding the direction information and distance information about the audio signal, and sending the encoded information. The apparatus for generating audio signals includes an audio signal distance information obtaining module and an audio signal encoding module. With the present invention, the position information, including direction information and distance information, about the audio signal may be obtained accurately in combination with a three-dimensional video signal and a three-dimensional audio signal, without increasing the size of a microphone array, and the audio signal is sent and played.
Abstract:
There is provided a unique signal processing technique for localizing and characterizing each of a number of differently located acoustic sources. Specifically there is provided a method for auditory segregation of multiple voice inputs comprising the steps of: receiving a plurality of voice input signals from different source locations; filtering said voice input signals with head related transfer functions (HRTF) using a digital signal processor (DSP) thereby assigning the voice input signals to different locations in virtual auditory space; and changing the HRTF filtered voice input signals in two dimensions, wherein pitch is changed and the signal is filtered with different filters emulating vocal tracts of different sizes thereby further segregating the voice input signals from each other.
Abstract:
The Direction of Arrival estimation algorithm ESPRIT is capable of estimating the angles of arrival of N narrowband source signals using M > N anechoic sensor mixtures from a uniform linear array (ULA). Using a similar parameter estimation step, the DUET Blind Source Separation algorithm can demix N > 2 speech signals using M = 2 anechoic mixtures of the signals. The present invention demixes N > M speech signals using M >= 2 anechoic mixtures.
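The parameter estimation step that DUET relies on can be sketched for a single time-frequency point. In the anechoic two-mixture model, a source that dominates a given point reaches the second sensor as a scaled and delayed copy of what reaches the first, so the ratio of the two mixture coefficients reveals the relative attenuation and delay. The sketch below covers only that per-point estimate, not the invention's full demixing procedure. The function name `duet_parameters` and the numeric values are hypothetical.

```python
import cmath

def duet_parameters(x1, x2, omega):
    """Estimate relative attenuation and delay at one time-frequency point.

    Assumes a single source dominates the point, so that
        x2 = a * exp(-1j * omega * delta) * x1
    where `a` is the relative attenuation and `delta` the relative delay.
    The delay is unambiguous only while |omega * delta| < pi.
    """
    ratio = x2 / x1
    attenuation = abs(ratio)
    delay = -cmath.phase(ratio) / omega
    return attenuation, delay

# Synthetic check with made-up values.
true_a, true_delta, omega = 0.7, 0.5, 1.2
x1 = 0.8 + 0.3j
x2 = true_a * cmath.exp(-1j * omega * true_delta) * x1
est_a, est_delta = duet_parameters(x1, x2, omega)
```

In a full separation system, such estimates would be gathered over all time-frequency points and clustered, one cluster per source.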