Abstract:
A signal separator (100) for determining a first output signal (a1) describing an audio content of a useful-signal source in a first microphone signal, and for determining a second output signal (a2) describing an audio content of the useful-signal source in a second microphone signal, comprises a source separator (130) for receiving the two microphone signals and for separating audio contents of at least two signal sources. The source separator (130) is designed to obtain a first partial signal (y1) essentially describing an audio content of a first signal source, and representing the first output signal, and to obtain a second partial signal (y2) essentially describing an audio content of a second signal source. The source separator (130) is designed to adjust parameters of a processing specification for generating the first partial signal from the microphone signals in such a manner that a distortion of the first partial signal relative to the first microphone signal is smaller than a maximum distortion, and to adjust parameters of a processing specification for generating the second partial signal from the microphone signals in such a manner that a distortion of the second partial signal relative to the second microphone signal is smaller than a maximum distortion. The signal separator further includes a signal remover for removing the second partial signal from the second microphone signal so as to obtain the second output signal, in which the second partial signal is reduced. The signal separator offers the advantage that a multi-channel representation of the microphone signals which is corrected in terms of an interference signal may be achieved in a particularly simple manner by using a secondary condition in the source separator.
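The signal-remover step can be illustrated with a minimal sketch. The abstract does not say how the second partial signal is removed, so the least-squares scaled subtraction below is an assumption, and the function name `remove_partial_signal` and the toy sample values are hypothetical.

```python
def remove_partial_signal(mic2, y2):
    """Subtract a scaled copy of the partial signal y2 from mic2.

    Assumed scheme: the gain is the least-squares projection of mic2 onto y2.
    The abstract only states that y2 is removed from the second microphone signal.
    """
    energy = sum(v * v for v in y2)
    if energy == 0.0:
        return list(mic2)  # nothing to remove
    gain = sum(m * v for m, v in zip(mic2, y2)) / energy
    return [m - gain * v for m, v in zip(mic2, y2)]


# Toy data (hypothetical): the useful signal is orthogonal to the interference.
useful = [1.0, -1.0, 1.0, -1.0]
interference = [0.5, 0.5, 0.5, 0.5]
mic2 = [u + 2.0 * i for u, i in zip(useful, interference)]

a2 = remove_partial_signal(mic2, interference)
```

In this toy case the interference is removed completely because it is orthogonal to the useful signal; with real recordings the subtraction would only reduce it.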
Abstract:
A method for generating and playing audio signals and a system for processing audio signals are disclosed. The method for generating audio signals includes: generating distance information about an audio signal corresponding to a view point position, according to obtained auxiliary video and direction information about the audio signal, where the auxiliary video is a disparity map or a depth map; encoding the direction information and distance information about the audio signal, and sending the encoded information. The apparatus for generating audio signals includes an audio signal distance information obtaining module and an audio signal encoding module. With the present invention, the position information, including direction information and distance information, about the audio signal may be obtained accurately in combination with a three-dimensional video signal and a three-dimensional audio signal, without increasing the size of a microphone array, and the audio signal is sent and played.
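How distance information might be derived from a disparity or depth map together with a direction can be sketched as follows. The abstract gives no formulas, so everything here is an assumption: a pinhole camera model, the standard stereo relation Z = f * B / d for converting disparity to depth, a single image row, and the hypothetical names `depth_from_disparity` and `source_distance`.

```python
import math


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard stereo relation Z = f * B / d (assumed camera model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def source_distance(depth_row, azimuth_rad, focal_px, center_col):
    """Look up the depth along a horizontal direction and convert it to range.

    depth_row holds depth values in metres for one image row (hypothetical layout).
    """
    col = int(round(center_col + focal_px * math.tan(azimuth_rad)))
    col = max(0, min(len(depth_row) - 1, col))  # clamp to the image
    # Depth is measured along the optical axis; divide by cos to get ray length.
    return depth_row[col] / math.cos(azimuth_rad)


depth_row = [4.0, 3.0, 2.0, 2.5, 5.0]  # toy values in metres
straight_ahead = source_distance(depth_row, 0.0, focal_px=2.0, center_col=2)
off_axis = source_distance(depth_row, math.atan(0.5), focal_px=2.0, center_col=2)
```

The sketch treats only the horizontal angle; a full implementation would also need elevation and the calibration of the actual camera.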
Abstract:
An enhanced blind source separation technique is provided to improve separation of highly correlated signal mixtures. A beamforming algorithm is used to precondition correlated first and second input signals in order to avoid indeterminacy problems typically associated with blind source separation. The beamforming algorithm may apply spatial filters to the first signal and second signal in order to amplify signals from a first direction while attenuating signals from other directions. Such directionality may serve to amplify a desired speech signal in the first signal and attenuate the desired speech signal in the second signal. Blind source separation is then performed on the beamformer output signals to separate the desired speech signal and the ambient noise and reconstruct an estimate of the desired speech signal. To enhance the operation of the beamformer and/or blind source separation, calibration may be performed at one or more stages.
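The preconditioning idea can be illustrated with the simplest two-microphone spatial filter, a sum/difference beamformer. This is a sketch under assumptions, not the filter from the abstract: it assumes the desired speech reaches both microphones at the same time and level, and the name `sum_difference_beamformer` and the toy samples are hypothetical. The blind source separation stage is not shown.

```python
def sum_difference_beamformer(mic1, mic2):
    """Return (enhanced, noise_reference) for two time-aligned channels.

    The sum reinforces components common to both microphones (the assumed
    speech direction); the difference cancels them, leaving mostly noise.
    """
    enhanced = [(a + b) / 2.0 for a, b in zip(mic1, mic2)]
    noise_ref = [(a - b) / 2.0 for a, b in zip(mic1, mic2)]
    return enhanced, noise_ref


# Toy mixture (hypothetical): speech is in phase, noise is in antiphase.
speech = [0.0, 1.0, 0.0, -1.0]
noise = [0.3, -0.2, 0.1, 0.4]
mic1 = [s + n for s, n in zip(speech, noise)]
mic2 = [s - n for s, n in zip(speech, noise)]

enhanced, noise_ref = sum_difference_beamformer(mic1, mic2)
```

The toy mixture is constructed so that the two outputs separate perfectly. Real mixtures would leave residual speech and noise in both outputs, which is where the subsequent separation stage comes in.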
Abstract:
A sound-source signal separating method includes the steps of: enhancing a target sound-source signal in an input audio signal, the input audio signal being a mixture of acoustic signals from a plurality of sound sources picked up by a plurality of sound pickup devices; detecting a pitch of the target sound-source signal in the input audio signal; and separating the target sound-source signal from the input audio signal based on the detected pitch and the sound-source signal enhanced in the enhancing step.
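The pitch-detection step can be sketched with a plain autocorrelation search. The abstract does not name a pitch detector, so autocorrelation is a swapped-in, commonly used technique; the function name `detect_pitch`, the search range of 50 to 400 Hz, and the test tone are assumptions.

```python
import math


def detect_pitch(signal, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate pitch in Hz as the lag with the largest autocorrelation.

    Assumes the signal is longer than sample_rate / f_min samples.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Test tone (hypothetical): 100 Hz sine, 0.1 s at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]
pitch = detect_pitch(tone, sample_rate)
```

The sketch covers detection only; how the detected pitch is combined with the enhanced signal to perform the separation is not specified in the abstract and is not modelled here.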
Abstract:
A method separates acoustic signals generated by multiple acoustic sources, such as mixed speech spoken simultaneously by several speakers (101,102) in the same room. For each source, the acoustic signals are combined into a mixed signal acquired by multiple microphones (110), at least one for each source. The mixed signal is filtered, and the filtered signals are summed into a signal (131) from which features are extracted. A target sequence (151) through a factorial HMM is estimated, and filter parameters (161) are optimized accordingly. These steps are repeated until the filter parameters converge to optimal filtering parameters, which are then used to filter the mixed signal once more, and the summed output of this last filtering is the acoustic signal for a particular acoustic source.
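The filter-and-sum stage described above can be sketched as one FIR filter per microphone channel followed by a sample-wise sum. The filter taps here are fixed by hand for illustration; in the method they would be the parameters being optimized, and the feature extraction and factorial HMM steps are not shown. The names `fir_filter` and `filter_and_sum` and the sample data are hypothetical.

```python
def fir_filter(signal, taps):
    """Causal FIR filter: out[n] = sum_k taps[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def filter_and_sum(channels, filters):
    """Filter each microphone channel with its own taps and sum the results."""
    filtered = [fir_filter(x, h) for x, h in zip(channels, filters)]
    return [sum(samples) for samples in zip(*filtered)]


# Toy data (hypothetical): microphone 1 hears the source one sample later
# than microphone 2, so delaying channel 2 by one sample aligns them.
mic1 = [0.0, 1.0, 2.0, 3.0]
mic2 = [1.0, 2.0, 3.0, 0.0]
summed = filter_and_sum([mic1, mic2], [[0.5], [0.0, 0.5]])
```

With these taps the two channels add coherently, so `summed` reproduces the delayed source.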
Abstract:
A signal processing system is provided which includes one or more receivers for receiving signals generated by a plurality of signal sources. The system has a memory for storing a predetermined function which gives, for a set of input signal values, a probability density for parameters of a respective signal model which is assumed to have generated the signals in the received signal values. The system applies a set of received signal values to the stored function to generate the probability density function and then draws samples from it. The system then analyses the drawn samples to determine parameter values representative of the signal from at least one of the sources.
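The idea of applying received values to a stored density function, drawing samples from it, and analysing the samples can be sketched with a random-walk Metropolis sampler. The abstract does not name the sampling method or the signal model, so both are assumptions: a constant signal level `mu` observed in Gaussian noise of known standard deviation with a flat prior. All names and values are hypothetical.

```python
import math
import random


def log_density(mu, observations, noise_std):
    """Unnormalised log density of mu under the assumed Gaussian model."""
    return -sum((x - mu) ** 2 for x in observations) / (2.0 * noise_std ** 2)


def draw_samples(observations, noise_std, n_samples=5000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for mu (swapped-in technique)."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_density(mu, observations, noise_std)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_density(proposal, observations, noise_std)
        # Accept with probability min(1, density ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


observations = [1.8, 2.1, 2.0, 1.9, 2.2]  # toy received signal values
samples = draw_samples(observations, noise_std=0.5)
kept = samples[500:]                      # discard burn-in
estimate = sum(kept) / len(kept)          # sample mean as the parameter value
```

Under this model the sample mean should land close to the average of the observations, which is the analysis step in its simplest form.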
Abstract:
A computerized method extracts features from an acoustic signal generated from one or more sources. The acoustic signal is first windowed and filtered to produce a spectral envelope for each source. The dimensionality of the spectral envelope is then reduced to produce a set of features for the acoustic signal. The features in the set are clustered to produce a group of features for each of the sources. The features in each group include spectral features and corresponding temporal features characterizing each source. Each group of features is a quantitative descriptor that is also associated with a qualitative descriptor. Hidden Markov models are trained with sets of known features and stored in a database. The database can then be indexed by sets of unknown features to select or recognize like acoustic signals.
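The first two steps, windowing into a spectral representation and reducing its dimensionality, can be sketched as follows. This is an illustration under assumptions: a Hann window and a direct DFT stand in for the unspecified windowing and filtering, and averaging adjacent frequency bins into bands is a crude stand-in for the dimensionality reduction, which the abstract does not specify. Clustering and hidden Markov model training are not shown, and all names and parameter values are hypothetical.

```python
import cmath
import math


def hann_window(length):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / length) for n in range(length)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2, computed directly (slow but simple)."""
    n_samples = len(frame)
    return [
        abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(n_samples)))
        for k in range(n_samples // 2 + 1)
    ]


def band_features(signal, frame_len=64, hop=32, bins_per_band=4):
    """Per-frame band-averaged spectra (assumed stand-in for reduced features)."""
    window = hann_window(frame_len)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = magnitude_spectrum(frame)
        bands = []
        for i in range(0, len(mags), bins_per_band):
            chunk = mags[i:i + bins_per_band]
            bands.append(sum(chunk) / len(chunk))
        features.append(bands)
    return features


# Test tone (hypothetical): 8 cycles per 64-sample frame, i.e. DFT bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
features = band_features(tone)
```

Each row of `features` is one time frame and each column one frequency band, so the result carries both spectral and temporal structure in a reduced form.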