Abstract:
A decoder for generating an audio output signal comprising one or more audio output channels from a downmix signal is provided. The downmix signal encodes one or more audio object signals. The decoder comprises a control unit (181) for setting an activation indication to an activation state depending on a signal property of at least one of the one or more audio object signals. Moreover, the decoder comprises a first analysis module (182) for transforming the downmix signal to obtain a first transformed downmix comprising a plurality of first subband channels. Furthermore, the decoder comprises a second analysis module (183) for generating, when the activation indication is set to the activation state, a second transformed downmix by transforming at least one of the first subband channels to obtain a plurality of second subband channels, wherein the second transformed downmix comprises the first subband channels which have not been transformed by the second analysis module and the second subband channels. Moreover, the decoder comprises an un-mixing unit (184), wherein the un-mixing unit (184) is configured to un-mix the second transformed downmix, when the activation indication is set to the activation state, based on parametric side information on the one or more audio object signals to obtain the audio output signal, and to un-mix the first transformed downmix, when the activation indication is not set to the activation state, based on the parametric side information on the one or more audio object signals to obtain the audio output signal. Furthermore, an encoder is provided.
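The control flow around the second analysis module can be sketched as follows. This is a minimal illustration, not the decoder's actual filter banks: `haar_split`, `second_analysis`, and the `to_split` index set are hypothetical names, and a simple two-band averaging/differencing split stands in for whatever transform the second analysis module really applies.

```python
from typing import List, Set

Channel = List[float]


def haar_split(channel: Channel) -> List[Channel]:
    """Toy stand-in for the second-stage transform: split one channel into two bands."""
    low, high = [], []
    for a, b in zip(channel[0::2], channel[1::2]):
        low.append((a + b) / 2.0)   # coarse (average) band
        high.append((a - b) / 2.0)  # detail (difference) band
    return [low, high]


def second_analysis(first_subbands: List[Channel], to_split: Set[int], active: bool) -> List[Channel]:
    """Return the transformed downmix that the un-mixing unit would operate on."""
    if not active:
        # Activation indication not set: un-mix the first transformed downmix as is.
        return first_subbands
    result: List[Channel] = []
    for index, channel in enumerate(first_subbands):
        if index in to_split:
            # Selected first subband channels are replaced by second subband channels.
            result.extend(haar_split(channel))
        else:
            # Untransformed first subband channels are passed through unchanged.
            result.append(channel)
    return result
```

With the activation indication set, the result mixes untouched first subband channels with the finer second subband channels, as the abstract describes for the second transformed downmix.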
Abstract:
An apparatus for generating an audio output signal to simulate a recording of a virtual microphone at a configurable virtual position in an environment includes a sound events position estimator and an information computation module. The former is adapted to estimate a sound source position indicating a position of a sound source in the environment, wherein the sound events position estimator is adapted to estimate the sound source position based on first and second direction information provided by first and second real spatial microphones, respectively, located at first and second real microphone positions in the environment, respectively. The information computation module is adapted to generate the audio output signal based on a first recorded audio input signal, on the first real microphone position, on the virtual position of the virtual microphone, and on the sound source position.
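One way to picture the position estimate is plain two-dimensional triangulation: each real spatial microphone contributes a line through its position along its measured direction, and the source is placed where the two lines cross. The sketch below assumes a 2-D environment and angles in radians; the function name and signature are hypothetical and are not taken from the abstract.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def estimate_source_position(mic1: Point, doa1: float, mic2: Point, doa2: float) -> Optional[Point]:
    """Intersect the two direction lines through mic1 and mic2 (angles in radians)."""
    d1 = (math.cos(doa1), math.sin(doa1))
    d2 = (math.cos(doa2), math.sin(doa2))
    # Solve mic1 + t1 * d1 = mic2 + t2 * d2 for t1 via the 2-D cross product.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        return None  # directions are parallel, so no unique intersection exists
    dx, dy = mic2[0] - mic1[0], mic2[1] - mic1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    return (mic1[0] + t1 * d1[0], mic1[1] + t1 * d1[1])
```

For example, microphones at (0, 0) and (2, 0) reporting directions of 45 and 135 degrees place the source at approximately (1, 1). The sketch treats the directions as full lines and does not check that the intersection lies in front of both microphones.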
Abstract:
A multi-mode audio signal decoder has a spectral value determinator to obtain sets of decoded spectral coefficients for a plurality of portions of an audio content and a spectrum processor configured to apply a spectral shaping to a set of spectral coefficients in dependence on a set of linear-prediction-domain parameters for a portion of the audio content encoded in a linear-prediction mode, and in dependence on a set of scale factor parameters for a portion of the audio content encoded in a frequency-domain mode. The audio signal decoder has a frequency-domain-to-time-domain converter configured to obtain a time-domain audio representation on the basis of a spectrally-shaped set of decoded spectral coefficients for a portion of the audio content encoded in the linear-prediction mode and for a portion of the audio content encoded in the frequency domain mode. An audio signal encoder is also described.
Abstract:
An apparatus for generating an enhanced downmix signal on the basis of a multi-channel microphone signal has a spatial analyzer configured to compute a set of spatial cue parameters having a direction information describing a direction-of-arrival of a direct sound, a direct sound power information and a diffuse sound power information on the basis of the multi-channel microphone signal. The apparatus also has a filter calculator for calculating enhancement filter parameters in dependence on the direction information describing the direction-of-arrival of the direct sound, in dependence on the direct sound power information and in dependence on the diffuse sound power information. The apparatus also has a filter for filtering the microphone signal, or a signal derived therefrom, using the enhancement filter parameters, to obtain the enhanced downmix signal.
Abstract:
An apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation and a parametric side information associated with the downmix signal representation has a parameter adjuster. The parameter adjuster is configured to receive one or more parameters and to provide, on the basis thereof, one or more adjusted parameters. The parameter adjuster is configured to provide the one or more adjusted parameters in dependence on an average value of a plurality of parameter values, such that a distortion of the upmix signal representation caused by the use of non-optimal parameters is reduced at least for parameters deviating from optimal parameters by more than a predetermined deviation.
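The abstract does not say how the average enters the adjustment. As a purely illustrative assumption, the sketch below pulls each parameter part of the way toward the average of all parameter values, which limits how far any single outlier can deviate. The function name and the `strength` parameter are hypothetical.

```python
from typing import List


def adjust_parameters(params: List[float], strength: float = 0.5) -> List[float]:
    """Pull each parameter toward the average of all parameter values.

    strength = 0 leaves the parameters unchanged; strength = 1 replaces
    every parameter by the average.
    """
    if not params:
        return []
    average = sum(params) / len(params)
    return [p + strength * (average - p) for p in params]
```

With `strength=0.5`, the parameters `[0.0, 2.0, 4.0]` become `[1.0, 2.0, 3.0]`: the values far from the average move most, and the value at the average is unchanged.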
Abstract:
An apparatus for deriving, from an audio signal, a multi-channel audio signal comprising a front-loudspeaker signal and a back-loudspeaker signal comprises an apparatus for generating an ambient signal from the audio signal, an apparatus for providing the audio signal or a signal derived therefrom as the front-loudspeaker signal, and a back-loudspeaker-signal-providing apparatus for providing the ambient signal, or a signal derived therefrom, as the back-loudspeaker signal. The apparatus for generating the ambient signal comprises means for lossy compression of a representation of the audio signal so as to obtain a compressed representation of the audio signal describing a compressed audio signal. The means for lossy compression is configured such that signal portions exhibiting a regular distribution of the energy or carrying a large signal energy are preferentially included in the compressed representation. The apparatus for generating the ambient signal further comprises means for calculating a difference between the compressed representation of the audio signal and the representation of the audio signal so as to obtain a discrimination representation, which describes those portions of the audio signal not reproduced in the lossily compressed representation. The apparatus further comprises means for providing the ambient signal using the discrimination representation; in the multi-channel apparatus, the discrimination representation forms the ambient signal.
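The difference-based idea can be sketched on a single frame of spectral coefficients. The compression step here is a deliberately crude assumption: it keeps only the largest-magnitude coefficients, in line with the preference for high-energy portions, and is not the compression scheme of the apparatus. Both function names and the `keep` parameter are hypothetical.

```python
from typing import List


def lossy_compress(spectrum: List[float], keep: int) -> List[float]:
    """Toy lossy compression: keep only the `keep` largest-magnitude coefficients."""
    ranked = sorted(range(len(spectrum)), key=lambda i: abs(spectrum[i]), reverse=True)
    kept = set(ranked[:keep])
    return [value if i in kept else 0.0 for i, value in enumerate(spectrum)]


def ambient_from_difference(spectrum: List[float], keep: int) -> List[float]:
    """Difference between the representation and its compressed version."""
    compressed = lossy_compress(spectrum, keep)
    # Whatever the compression dropped is treated as the ambient part.
    return [original - approx for original, approx in zip(spectrum, compressed)]
```

For the frame `[9.0, 0.5, -7.0, 0.25]` with `keep=2`, the two strong coefficients survive compression and the difference `[0.0, 0.5, 0.0, 0.25]` contains only the weak residual.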
Abstract:
For classifying different segments of a signal which has segments of at least a first type and a second type, e.g. audio and speech segments, the signal is short-term classified on the basis of at least one short-term feature extracted from the signal, and a short-term classification result is delivered. The signal is also long-term classified on the basis of the at least one short-term feature and at least one long-term feature extracted from the signal, and a long-term classification result is delivered. The short-term classification result and the long-term classification result are combined to provide an output signal indicating whether a segment of the signal is of the first type or of the second type.
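The abstract leaves the combination rule open. A minimal sketch, assuming each classifier delivers a score in [0, 1] for the segment being of the first type and that the two scores are merged by a weighted average with a threshold, might look like this. The function name, `long_weight`, and `threshold` are hypothetical.

```python
def combine_classifications(short_term: float, long_term: float,
                            long_weight: float = 0.5, threshold: float = 0.5) -> str:
    """Merge two scores in [0, 1] and decide between the first and second type."""
    combined = (1.0 - long_weight) * short_term + long_weight * long_term
    return "first" if combined >= threshold else "second"
```

With the default weights, scores of 0.9 (short-term) and 0.7 (long-term) yield "first", while 0.6 and 0.2 yield "second": the long-term result can overrule a marginal short-term decision.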
Abstract:
An apparatus for extracting an ambient signal from an input audio signal comprises a gain-value determinator configured to determine a sequence of time-varying ambient signal gain values for a given frequency band of the time-frequency distribution of the input audio signal in dependence on the input audio signal. The apparatus comprises a weighter configured to weight one of the sub-band signals representing the given frequency band of the time-frequency-domain representation with the time-varying gain values, to obtain a weighted sub-band signal. The gain-value determinator is configured to obtain one or more quantitative feature values describing one or more features of the input audio signal and to provide the gain values as a function of the one or more quantitative feature values, such that the gain values are quantitatively dependent on the quantitative feature values. The gain-value determinator is configured to determine the gain values such that ambience components are emphasized over non-ambience components in the weighted sub-band signal.
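The weighting step can be sketched for one sub-band as a per-sample multiplication by a gain derived from a feature value. The mapping used here is an assumption made only for illustration: the feature is taken to lie in [0, 1] and to be larger for direct (non-ambient) sound, so the gain falls as the feature rises. Both function names are hypothetical.

```python
from typing import List


def gain_from_feature(feature_value: float) -> float:
    """Hypothetical mapping from a feature value to an ambient gain in [0, 1]."""
    clipped = min(max(feature_value, 0.0), 1.0)
    return 1.0 - clipped  # strong "direct sound" feature -> small ambient gain


def weight_subband(subband: List[float], features: List[float]) -> List[float]:
    """Weight each sub-band sample with its time-varying gain value."""
    return [sample * gain_from_feature(f) for sample, f in zip(subband, features)]
```

For a constant sub-band `[2.0, 2.0, 2.0]` and feature values `[0.0, 0.5, 1.5]`, the weighted sub-band is `[2.0, 1.0, 0.0]`, so samples judged to be ambient pass through and samples judged to be direct are attenuated.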
Abstract:
For transient audio input signals in a multi-channel audio reconstruction, uncorrelated output signals are generated from an audio input signal in that the audio input signal is mixed with a representation of the audio input signal that is delayed by a delay time, such that, in a first time interval, a first output signal corresponds to the audio input signal and a second output signal corresponds to the delayed representation of the audio input signal, whereas, in a second time interval, the first output signal corresponds to the delayed representation of the audio input signal and the second output signal corresponds to the audio input signal.
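One reading of this scheme is that the two outputs swap roles between successive time intervals: whichever output carried the direct signal in one interval carries the delayed copy in the next. The sketch below assumes sample-based delay and interval lengths and strictly alternating intervals; the function name and both parameters are hypothetical.

```python
from typing import List, Tuple


def switched_delay_outputs(signal: List[float], delay: int, interval: int) -> Tuple[List[float], List[float]]:
    """Produce two outputs that alternate between the direct and the delayed signal."""
    # Delayed representation: zero-padded at the start, truncated to the input length.
    delayed = ([0.0] * delay + signal)[: len(signal)]
    first: List[float] = []
    second: List[float] = []
    for n, (direct, late) in enumerate(zip(signal, delayed)):
        swap = (n // interval) % 2 == 1  # every other interval swaps the roles
        first.append(late if swap else direct)
        second.append(direct if swap else late)
    return first, second
```

For the input `[1.0, 2.0, 3.0, 4.0]` with a one-sample delay and two-sample intervals, the first output is `[1.0, 2.0, 2.0, 3.0]` and the second is `[0.0, 1.0, 3.0, 4.0]`: the outputs exchange the direct and delayed signals after the first interval.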