Abstract:
Binaural rendering of a multi-channel audio signal into a binaural output signal (24) is described. The multi-channel audio signal comprises a stereo downmix signal (18) into which a plurality of audio signals are downmixed, and side information comprising downmix information (DMG, DCLD) indicating, for each audio signal, to what extent the respective audio signal has been mixed into a first channel and a second channel of the stereo downmix signal (18), respectively, as well as object level information of the plurality of audio signals and inter-object cross correlation information describing similarities between pairs of audio signals of the plurality of audio signals. Based on a first rendering prescription, a preliminary binaural signal (54) is computed from the first and second channels of the stereo downmix signal (18). A decorrelated signal (X_d^{n,k}) is generated as a perceptual equivalent to a mono downmix (58) of the first and second channels of the stereo downmix signal (18) which is, however, decorrelated from the mono downmix (58). Depending on a second rendering prescription (P_2^{l,m}), a corrective binaural signal (64) is computed from the decorrelated signal (62), and the preliminary binaural signal (54) is mixed with the corrective binaural signal (64) to obtain the binaural output signal (24).
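The mixing step described in this abstract can be sketched for a single time/frequency tile as follows. This is only an illustrative sketch: the function name `render_tile`, the representation of the first rendering prescription as a 2x2 matrix `g`, and of the second rendering prescription as a pair of gains `p2` are assumptions, and the decorrelated value is passed in rather than generated, since the abstract does not specify the decorrelator.

```python
def render_tile(left, right, g, p2, decorrelated):
    """Mix a preliminary and a corrective binaural signal for one tile.

    left, right  : the two channels of the stereo downmix (complex or float)
    g            : hypothetical 2x2 matrix standing in for the first
                   rendering prescription
    p2           : hypothetical pair of gains standing in for the second
                   rendering prescription
    decorrelated : decorrelated version of the mono downmix, assumed to be
                   produced elsewhere
    """
    # Preliminary binaural signal from the stereo downmix.
    prelim_l = g[0][0] * left + g[0][1] * right
    prelim_r = g[1][0] * left + g[1][1] * right
    # Corrective binaural signal from the decorrelated signal.
    corr_l = p2[0] * decorrelated
    corr_r = p2[1] * decorrelated
    # Binaural output: preliminary mixed with corrective.
    return (prelim_l + corr_l, prelim_r + corr_r)
```

With an identity matrix for `g`, the output is simply the downmix plus the scaled decorrelated signal in each ear.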
Abstract:
A downmixer for providing a downmix signal on the basis of a plurality of input signals is configured to determine a magnitude value of a spectral domain value of the downmix signal on the basis of loudness information of the input signals. The downmixer is configured to determine a phase value of the spectral domain value of the downmix signal, and to apply the phase value in order to obtain a complex-valued number representation of the spectral domain value of the downmix signal on the basis of the magnitude value of the spectral domain value of the downmix signal. An audio encoder uses such a downmixer. A method for downmixing and a computer program are also described.
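The separation of magnitude and phase described above can be illustrated for one spectral bin. This is a minimal sketch under two assumptions not stated in the abstract: loudness is approximated by energy (squared magnitude), and the phase is taken from the plain complex sum of the inputs. The function name `downmix_bin` is hypothetical.

```python
import cmath
import math


def downmix_bin(values):
    """Downmix one spectral bin from several complex input values."""
    # Assumption: loudness information is approximated by the summed energy.
    magnitude = math.sqrt(sum(abs(v) ** 2 for v in values))
    # Assumption: the phase value is the phase of the plain complex sum.
    phase = cmath.phase(sum(values))
    # Apply the phase to the magnitude to get the complex-valued result.
    return cmath.rect(magnitude, phase)
```

One consequence of choosing the magnitude independently of the phase is that two out-of-phase inputs such as `1` and `-1` do not cancel: the magnitude in this sketch remains `sqrt(2)` even though the plain sum is zero.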
Abstract:
An apparatus for generating a filtered audio signal from an audio input signal includes a filter information determiner configured to determine filter information depending on input height information, wherein the input height information depends on a height of a virtual sound source. Moreover, the apparatus includes a filter unit configured to filter the audio input signal depending on the filter information to obtain the filtered audio signal. The filter information determiner is configured to determine the filter information either by selecting, depending on the input height information, a selected filter curve from a plurality of filter curves, or by determining a modified filter curve by modifying a reference filter curve depending on the elevation information.
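The first alternative, selecting a filter curve depending on the height, might look like the sketch below. The data layout is an assumption made for illustration: filter curves are stored as lists of per-band gains keyed by the height they were designed for, and the nearest height wins. Neither the function names nor the nearest-neighbour rule come from the abstract.

```python
def select_filter_curve(height, curves):
    """Pick the curve whose design height is closest to `height`.

    curves: hypothetical dict mapping a design height to a list of
            per-band gains.
    """
    nearest = min(curves, key=lambda h: abs(h - height))
    return curves[nearest]


def apply_filter_curve(band_values, curve):
    """Filter by scaling each band value with the matching gain."""
    return [x * g for x, g in zip(band_values, curve)]
```

For example, with curves defined for heights `0.0` and `1.0`, an input height of `0.8` selects the curve for `1.0`.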
Abstract:
An apparatus for generating loudspeaker signals is provided. The apparatus comprises an object metadata processor (110) and an object renderer (120). The object renderer (120) is configured to receive an audio object. The object metadata processor (110) is configured to receive metadata comprising an indication of whether the audio object is screen-related, and further comprising a first position of the audio object. The object metadata processor (110) is configured to calculate a second position of the audio object depending on the first position of the audio object and depending on a size of a screen, if the audio object is indicated in the metadata as being screen-related. The object renderer (120) is configured to generate the loudspeaker signals depending on the audio object and depending on position information. The object metadata processor (110) is configured to feed the first position of the audio object as the position information into the object renderer (120), if the audio object is indicated in the metadata as being not screen-related. The object metadata processor (110) is configured to feed the second position of the audio object as the position information into the object renderer (120), if the audio object is indicated in the metadata as being screen-related.
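The decision made by the object metadata processor can be sketched as a small function. The class and function names are hypothetical, and the remapping itself (linear scaling of the azimuth by the ratio of the actual to a reference screen half-width) is an assumption chosen only to make the example concrete; the abstract says only that the second position depends on the first position and the screen size.

```python
from dataclasses import dataclass


@dataclass
class ObjectMetadata:
    azimuth_deg: float
    elevation_deg: float
    screen_related: bool


def position_for_renderer(meta, screen_half_width_deg, reference_half_width_deg):
    """Return the position information to feed into the object renderer."""
    first_position = (meta.azimuth_deg, meta.elevation_deg)
    if not meta.screen_related:
        # Not screen-related: pass the first position through unchanged.
        return first_position
    # Screen-related: compute a second position from the first position
    # and the screen size (assumed linear azimuth scaling).
    scale = screen_half_width_deg / reference_half_width_deg
    return (meta.azimuth_deg * scale, meta.elevation_deg)
```

An object at 10 degrees azimuth that is flagged as screen-related moves to 5 degrees when the actual screen is half as wide as the reference; an unflagged object stays at 10 degrees.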
Abstract:
A device for generating a binaural signal based on a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration having a virtual sound source position associated with each channel is described. It comprises a correlation reducer for differently processing, and thereby reducing a correlation between, at least one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels, in order to obtain an inter-similarity reduced set of channels; a plurality of directional filters; a first mixer for mixing outputs of the directional filters modeling the acoustic transmission to the first ear canal of the listener; and a second mixer for mixing outputs of the directional filters modeling the acoustic transmission to the second ear canal of the listener. According to another aspect, a center level reduction for forming the downmix for a room processor is performed. According to yet another aspect, an inter-similarity decreasing set of head-related transfer functions is formed.
Abstract:
Audio data processor, comprising: a receiver interface for receiving encoded audio data and metadata related to the encoded audio data; a metadata parser for parsing the metadata to determine an audio data manipulation possibility; an interaction interface for receiving an interaction input and for generating, from the interaction input, interaction control data related to the audio data manipulation possibility; and a data stream generator for obtaining the interaction control data and the encoded audio data and the metadata and for generating an output data stream, the output data stream comprising the encoded audio data, at least a portion of the metadata, and the interaction control data.
Abstract:
A method for processing an audio signal (400) in accordance with a room impulse response is described. The method includes separately processing the audio signal (400) with an early part and a late reverberation of the room impulse response, wherein the separate processing includes processing the audio signal (400) with the early part of the room impulse response during a first process (422), processing the audio signal (400) with the late reverberation of the room impulse response or with a synthetic reverberation during a second process (424) that is different and separate from the first process (422), and changing from the first process (422) to the second process (424) at a transition from the early part to the late reverberation in the room impulse response. The method further includes combining (432) the audio signal (428) processed with the early part of the room impulse response and the audio signal (430) processed with the late reverberation of the room impulse response or with the synthetic reverberation. The transition from the early part to the late reverberation in the room impulse response is the time at which a correlation measure reaches a threshold. The correlation measure describes, with regard to the room impulse response, the similarity between the decay in acoustic energy that includes the initial state and the decay in acoustic energy starting at a point in time following the initial state, over a predefined frequency range. The threshold is set dependent on the correlation measure for said point in time, said point in time being the time of a selected one of the early reflections in the early part (301, 302) of the room impulse response, and the selected one of the early reflections is the first reflection.
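The split into two separate processes followed by a combination step can be sketched with plain convolution. This is an illustrative sketch only: the transition index is taken as a given input (the correlation-based detection of the transition is not implemented here), the late part uses the measured tail rather than a synthetic reverberation, and the function names are hypothetical.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def process_split(signal, rir, transition):
    """Process a signal with the early and late parts of a room impulse
    response in two separate passes, then combine the results.

    transition: sample index where the late reverberation is assumed to
                start, with 0 < transition < len(rir).
    """
    early_out = convolve(signal, rir[:transition])   # first process
    late_out = convolve(signal, rir[transition:])    # second process
    out = [0.0] * (len(signal) + len(rir) - 1)
    for n, v in enumerate(early_out):
        out[n] += v
    # The late part starts `transition` samples into the response,
    # so its output is delayed by the same amount before combining.
    for n, v in enumerate(late_out):
        out[n + transition] += v
    return out
```

When the late part is the measured tail, as in this sketch, the combined output matches a single convolution with the full impulse response, which makes it a convenient check that the split and delay are handled consistently.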