Abstract:
An apparatus for generating one or more audio output channels is provided. The apparatus comprises a parameter processor (110) for calculating mixing information and a downmix processor (120) for generating the one or more audio output channels. The downmix processor (120) is configured to receive an audio transport signal comprising one or more audio transport channels. One or more audio channel signals and one or more audio object signals are mixed within the audio transport signal, and the number of the one or more audio transport channels is smaller than the number of the one or more audio channel signals plus the number of the one or more audio object signals. The parameter processor (110) is configured to receive downmix information indicating how the one or more audio channel signals and the one or more audio object signals are mixed within the one or more audio transport channels, and to receive covariance information. Moreover, the parameter processor (110) is configured to calculate the mixing information depending on the downmix information and on the covariance information. The downmix processor (120) is configured to generate the one or more audio output channels from the audio transport signal depending on the mixing information. The covariance information indicates level difference information for at least one of the one or more audio channel signals and for at least one of the one or more audio object signals. However, the covariance information does not indicate correlation information for any pair of one of the one or more audio channel signals and one of the one or more audio object signals.
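The constraint on the covariance information can be pictured as a block-diagonal matrix. The following is only a sketch under assumed notation: the symbols $E$, $E_{ch}$, $E_{obj}$, $D$, $X$ and $Y$ are illustrative and do not appear in the abstract, and treating the non-signalled channel/object cross terms as zero is an assumption. With the channel signals ordered before the object signals:

```latex
\[
X = \begin{pmatrix} X_{ch} \\ X_{obj} \end{pmatrix},
\qquad
Y = D\,X,
\qquad
E = \begin{pmatrix} E_{ch} & 0 \\ 0 & E_{obj} \end{pmatrix}
\]
```

Here $Y$ stands for the audio transport channels, and $D$ for the downmix information, with fewer rows than columns because there are fewer transport channels than input signals. The diagonal entries of $E_{ch}$ and $E_{obj}$ carry the level information for the channel signals and the object signals respectively, while the off-diagonal blocks reflect that no correlation is signalled for any channel/object pair.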
Abstract:
An apparatus for generating one or more audio output channels is provided. The apparatus comprises a parameter processor (110) for calculating output channel mixing information and a downmix processor (120) for generating the one or more audio output channels. The downmix processor (120) is configured to receive an audio transport signal comprising one or more audio transport channels, wherein two or more audio object signals are mixed within the audio transport signal, and wherein the number of the one or more audio transport channels is smaller than the number of the two or more audio object signals. The audio transport signal depends on a first mixing rule and on a second mixing rule. The first mixing rule indicates how to mix the two or more audio object signals to obtain a plurality of premixed channels. The second mixing rule indicates how to mix the plurality of premixed channels to obtain the one or more audio transport channels of the audio transport signal. The parameter processor (110) is configured to receive information on the second mixing rule, which indicates how to mix the plurality of premixed channels such that the one or more audio transport channels are obtained. Moreover, the parameter processor (110) is configured to calculate the output channel mixing information depending on an audio objects number indicating the number of the two or more audio object signals, depending on a premixed channels number indicating the number of the plurality of premixed channels, and depending on the information on the second mixing rule. The downmix processor (120) is configured to generate the one or more audio output channels from the audio transport signal depending on the output channel mixing information.
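The two-stage mixing described above can be illustrated with a small sketch. It assumes that both mixing rules are linear and can be written as matrices, which the abstract does not state; the names `P`, `Q`, `D` and `matmul`, and all gain values, are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for row in a
    ]

# Hypothetical first mixing rule: 3 audio objects -> 2 premixed channels.
P = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5]]

# Hypothetical second mixing rule: 2 premixed channels -> 1 transport channel.
Q = [[0.5, 0.5]]

# Under the linearity assumption, the overall object-to-transport downmix
# is the product of the two rules.
D = matmul(Q, P)

num_objects = len(P[0])   # audio objects number
num_premixed = len(P)     # premixed channels number
print(D, num_objects, num_premixed)
```

In this toy setup, the shape of the second rule together with the two counts fixes the dimensions of the overall downmix, which is one way to read why the parameter processor is given those three inputs.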
Abstract:
An audio encoder for encoding audio input data (101) to obtain audio output data (501) comprises an input interface (100) for receiving a plurality of audio channels, a plurality of audio objects and metadata related to one or more of the plurality of audio objects; a mixer (200) for mixing the plurality of objects and the plurality of channels to obtain a plurality of pre-mixed channels, each pre-mixed channel comprising audio data of a channel and audio data of at least one object; a core encoder (300) for core encoding core encoder input data; and a metadata compressor (400) for compressing the metadata related to the one or more of the plurality of audio objects. The audio encoder is configured to operate in at least one mode of a group of two modes comprising a first mode, in which the core encoder (300) is configured to encode the plurality of audio channels and the plurality of audio objects received by the input interface (100) as core encoder input data, and a second mode, in which the core encoder (300) is configured to receive, as the core encoder input data, the plurality of pre-mixed channels generated by the mixer (200).
Abstract:
A decoder for generating an audio output signal comprising one or more audio output channels from a downmix signal comprising a plurality of time-domain downmix samples is provided. The downmix signal encodes two or more audio object signals. The decoder comprises a window-sequence generator (134) for determining a plurality of analysis windows, wherein each of the analysis windows comprises a plurality of time-domain downmix samples of the downmix signal. Each analysis window of the plurality of analysis windows has a window length indicating the number of the time-domain downmix samples of said analysis window. The window-sequence generator (134) is configured to determine the plurality of analysis windows so that the window length of each of the analysis windows depends on a signal property of at least one of the two or more audio object signals. Moreover, the decoder comprises a t/f-analysis module (135) for transforming the plurality of time-domain downmix samples of each analysis window of the plurality of analysis windows from a time-domain to a time-frequency domain depending on the window length of said analysis window, to obtain a transformed downmix. Furthermore, the decoder comprises an un-mixing unit (136) for un-mixing the transformed downmix based on parametric side information on the two or more audio object signals to obtain the audio output signal. Moreover, an encoder is provided.
Abstract:
An apparatus for deriving a multi-channel audio signal, comprising a front-loudspeaker signal and a back-loudspeaker signal, from an audio signal is provided. The apparatus comprises an apparatus for generating an ambient signal from the audio signal, an apparatus for providing the audio signal or a signal derived therefrom as the front-loudspeaker signal, and a back-loudspeaker-signal-providing apparatus for providing the ambient signal, or a signal derived therefrom, as the back-loudspeaker signal. The apparatus for generating the ambient signal comprises means for lossy compression of a representation of the audio signal so as to obtain a compressed representation of the audio signal describing a compressed audio signal. The means for lossy compression is configured such that signal portions exhibiting a regular distribution of energy or carrying a large signal energy are preferentially included in the compressed representation. The apparatus for generating the ambient signal further comprises means for calculating a difference between the compressed representation of the audio signal and the representation of the audio signal so as to obtain a discrimination representation. The discrimination representation describes the difference between the representation of the audio signal and the compressed representation, and thus those portions of the audio signal that are not reproduced in the lossily compressed representation. The apparatus further comprises means for providing the ambient signal using the discrimination representation; in the multi-channel apparatus, the discrimination representation forms the ambient signal.
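The compress-then-subtract idea can be sketched as follows. This is a toy illustration and not the method of the patent: the "lossy compression" here simply keeps the highest-energy coefficients of a representation, the sign convention of the difference (representation minus compressed representation) is an assumption, and the names `lossy_compress`, `ambient_from` and the sample values are hypothetical.

```python
def lossy_compress(coeffs, keep):
    """Toy lossy compression: keep the `keep` highest-energy coefficients,
    set all others to zero."""
    ranked = sorted(range(len(coeffs)), key=lambda i: coeffs[i] ** 2, reverse=True)
    kept = set(ranked[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

def ambient_from(coeffs, keep):
    """Difference between the representation and its compressed version,
    i.e. the portions the compression did not retain."""
    compressed = lossy_compress(coeffs, keep)
    return [c - k for c, k in zip(coeffs, compressed)]

# Hypothetical representation of one frame of the audio signal.
spectrum = [4.0, 0.5, -3.0, 0.25, 1.0]

front = spectrum                         # audio signal itself as front-loudspeaker signal
back = ambient_from(spectrum, keep=2)    # residual as back-loudspeaker (ambient) signal
print(back)
```

The high-energy components end up only in the front signal, while the low-energy remainder that the compression discarded is routed to the back loudspeakers.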
Abstract:
An apparatus for improving a perceived quality of sound reproduction of an audio output signal is provided. The apparatus comprises an active noise cancellation unit (110) for generating a noise cancellation signal based on an environmental audio signal, wherein the environmental audio signal comprises noise signal portions, the noise signal portions resulting from recording environmental noise. Moreover, the apparatus comprises a residual noise characteristics estimator (120) for determining a residual noise characteristic depending on the environmental noise and the noise cancellation signal. Furthermore, the apparatus comprises a perceptual noise compensation unit (130) for generating a noise-compensated signal based on an audio target signal and based on the residual noise characteristic. Moreover, the apparatus comprises a combiner (140) for combining the noise cancellation signal and the noise-compensated signal to obtain the audio output signal.
Abstract:
A decoder for decoding an encoded audio signal to obtain a phase-adjusted audio signal is provided. The decoder comprises a decoding unit (110) and a phase adjustment unit (120). The decoding unit (110) is adapted to decode the encoded audio signal to obtain a decoded audio signal. The phase adjustment unit (120) is adapted to adjust the decoded audio signal to obtain the phase-adjusted audio signal. The phase adjustment unit (120) is configured to receive control information depending on a vertical phase coherence of the encoded audio signal. Moreover, the phase adjustment unit (120) is adapted to adjust the decoded audio signal based on the control information.
Abstract:
An apparatus for generating a merged audio data stream is provided. The apparatus comprises a demultiplexer (180) for obtaining a plurality of single-layer audio data streams. The demultiplexer (180) is adapted to receive one or more input audio data streams, wherein each input audio data stream comprises one or more layers. The demultiplexer (180) is adapted to demultiplex each one of the input audio data streams having one or more layers into two or more demultiplexed audio data streams having exactly one layer, such that the two or more demultiplexed audio data streams together comprise the one or more layers of the input audio data stream. Furthermore, the apparatus comprises a merging module (190) for generating the merged audio data stream, having one or more layers, based on the plurality of single-layer audio data streams. Each layer of the input audio data streams, of the demultiplexed audio data streams, of the single-layer audio data streams and of the merged audio data stream comprises a pressure value of a pressure signal, a position value and a diffuseness value as audio data.
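The layer structure and the demultiplexing step can be sketched as a simple data model. This is a minimal illustration under assumptions: the class `Layer`, the functions `demultiplex` and `collect_single_layer`, and the three-component position are hypothetical, and the merging module (190) is deliberately left out because the abstract does not say how it combines the single-layer streams.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Layer:
    pressure: float                        # pressure value of a pressure signal
    position: Tuple[float, float, float]   # position value (3D is an assumption)
    diffuseness: float                     # diffuseness value

def demultiplex(stream: List[Layer]) -> List[List[Layer]]:
    """Split one multi-layer stream into streams of exactly one layer each."""
    return [[layer] for layer in stream]

def collect_single_layer(streams: List[List[Layer]]) -> List[List[Layer]]:
    """Demultiplex every input stream and gather all single-layer streams."""
    single: List[List[Layer]] = []
    for stream in streams:
        single.extend(demultiplex(stream))
    return single

# Hypothetical input: one stream with two layers and one stream with one layer.
stream_a = [Layer(0.8, (1.0, 0.0, 0.0), 0.1), Layer(0.3, (0.0, 2.0, 0.0), 0.6)]
stream_b = [Layer(0.5, (-1.0, 1.0, 0.0), 0.2)]

single_layer_streams = collect_single_layer([stream_a, stream_b])
print(len(single_layer_streams))
```

The single-layer streams together contain exactly the layers of the inputs, which is the property the merging stage relies on.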
Abstract:
An audio mixer for mixing a plurality of audio tracks to a mixture signal comprises a semantic command interpreter (30; 35) for receiving a semantic mixing command and for deriving a plurality of mixing parameters for the plurality of audio tracks from the semantic mixing command; an audio track processor (70; 75) for processing the plurality of audio tracks in accordance with the plurality of mixing parameters; and an audio track combiner (76) for combining the plurality of audio tracks processed by the audio track processor into the mixture signal (MS). A corresponding method comprises: receiving a semantic mixing command; deriving a plurality of mixing parameters for the plurality of audio tracks from the semantic mixing command; processing the plurality of audio tracks in accordance with the plurality of mixing parameters; and combining the plurality of audio tracks resulting from the processing of the plurality of audio tracks to form the mixture signal.
Abstract:
To generate an ambient signal suitable for emission via loudspeakers for which no dedicated loudspeaker signal exists, for example for surround channels, a transient detector (11) is provided for detecting a transient period. A synthesis signal generator (12) generates a synthesis signal that satisfies both the transient condition and the continuity condition for the synthesis signal. A signal substituter (14) then replaces a portion of the examination signal with the synthesis signal to obtain an ambient signal for the surround channels.