Abstract:
Methods for dialogue enhancement of audio content, comprising: providing a first audio signal presentation of audio components of the content; providing a second audio signal presentation; receiving a set of dialogue estimation parameters configured to enable estimation of dialogue components from the first audio signal presentation; applying said set of dialogue estimation parameters to said first audio signal presentation to form a dialogue presentation of the dialogue components; and combining the dialogue presentation with said second audio signal presentation to form a dialogue enhanced audio signal presentation for reproduction on a second audio reproduction system, wherein at least one of said first and second audio signal presentations is a binaural audio signal presentation.
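The estimate-then-combine flow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the dialogue estimation parameters are modelled as a plain mixing matrix applied per sample, and the combination is a gain-weighted sum. The names `estimate_dialogue`, `enhance`, and `gain` are hypothetical and do not come from the source, which does not specify the parameter format.

```python
def estimate_dialogue(first, params):
    """Estimate dialogue channels from the first presentation.

    first:  list of channels, each a list of samples
    params: hypothetical mixing matrix; params[i][j] is the weight of
            input channel j in estimated dialogue channel i
    """
    n = len(first[0])
    return [
        [sum(w * ch[t] for w, ch in zip(row, first)) for t in range(n)]
        for row in params
    ]


def enhance(second, dialogue, gain):
    """Add the gain-scaled dialogue estimate to the second presentation."""
    return [
        [s + gain * d for s, d in zip(s_ch, d_ch)]
        for s_ch, d_ch in zip(second, dialogue)
    ]


# Toy two-channel example (values are made up for illustration).
first = [[1.0, 2.0], [3.0, 4.0]]      # e.g. a stereo presentation
second = [[0.0, 1.0], [1.0, 0.0]]     # e.g. a binaural presentation
params = [[0.5, 0.5], [0.5, 0.5]]     # assumed estimation parameters
dialogue = estimate_dialogue(first, params)
enhanced = enhance(second, dialogue, gain=2.0)
```

In a real system the parameters would more plausibly be applied per frequency band and time frame; the per-sample matrix here just keeps the sketch short.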
Abstract:
A method for representing a second presentation of audio channels or objects as a data stream, the method comprising the steps of: (a) providing a set of base signals, the base signals representing a first presentation of the audio channels or objects; and (b) providing a set of transformation parameters, the transformation parameters intended to transform the first presentation into the second presentation, the transformation parameters further being specified for at least two frequency bands and including a set of multi-tap convolution matrix parameters for at least one of the frequency bands.
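To make "multi-tap convolution matrix parameters" concrete, the sketch below applies such a matrix within a single frequency band: each output signal is a sum over base signals and over delayed taps. The layout `taps[k][i][j]` and the function name `apply_multitap_matrix` are assumptions made for illustration; the source does not define a storage format. A band that uses an ordinary (single-tap) matrix is the special case `len(taps) == 1`.

```python
def apply_multitap_matrix(base, taps):
    """Transform base signals of one band with a multi-tap matrix.

    base: list of base signals (first presentation), each a list of samples
    taps: taps[k][i][j] is the weight from base signal j, delayed by k
          samples, to output signal i (hypothetical layout)
    """
    n = len(base[0])
    n_out = len(taps[0])
    out = [[0.0] * n for _ in range(n_out)]
    for k, matrix in enumerate(taps):
        for i, row in enumerate(matrix):
            for j, w in enumerate(row):
                # Samples before index k have no delayed input available.
                for t in range(k, n):
                    out[i][t] += w * base[j][t - k]
    return out


# Two base signals, two taps: identity at delay 0, cross-feed at delay 1.
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
taps = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.5], [0.5, 0.0]],
]
second = apply_multitap_matrix(base, taps)
```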
Abstract:
An encoding system (100) encodes a first (E1) and further (E2, E3) audio signals as a layered bitstream (B), wherein a quantizer for each frequency band of each signal is selected using a rate allocation rule based on signal-specific rate allocation data, a spectral envelope of the signal and a reference level (EnvE1Max), which is determined based on the spectral envelope of the first signal and is not necessarily included in the bitstream. Further disclosed is a decoding system for reconstructing the audio signals based on the bitstream. In embodiments, the bitstream has a basic layer (BE1), which contains data that enable decoding of the first audio signal, and a spatial layer (Bspatial) facilitating decoding of the further audio signal(s). In embodiments, the encoding system prepares the bitstream subject to a basic-layer bitrate constraint and a total bitrate constraint.
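The abstract does not give the rate allocation rule itself, so the following is a purely hypothetical rule that uses the three named inputs: a signal-specific offset (standing in for the rate allocation data), the signal's spectral envelope, and a reference level. The reference is assumed here to be the peak of the first signal's envelope, as the name EnvE1Max suggests; the step size, the clamping, and all function names are likewise assumptions.

```python
def reference_level(env_first):
    # Assumption: the reference is the maximum of the first signal's envelope.
    return max(env_first)


def select_quantizers(env, ref, offset, n_quantizers, step=3.0):
    """Pick one quantizer index per frequency band (0 = coarsest).

    env:    spectral envelope of the signal, one level per band (dB)
    ref:    reference level derived from the first signal's envelope
    offset: signal-specific rate allocation value (hypothetical form)
    """
    indices = []
    for level in env:
        raw = int((level - ref + offset) // step)
        # Clamp to the range of available quantizers.
        indices.append(max(0, min(n_quantizers - 1, raw)))
    return indices


env_e1 = [30.0, 24.0, 12.0]   # made-up envelope of the first signal
env_e2 = [27.0, 15.0, 33.0]   # made-up envelope of a further signal
ref = reference_level(env_e1)
q_e1 = select_quantizers(env_e1, ref, offset=9.0, n_quantizers=4)
q_e2 = select_quantizers(env_e2, ref, offset=6.0, n_quantizers=4)
```

Under this assumption a decoder that has already decoded the first signal's envelope could recompute `ref` locally, which would be consistent with the reference level not necessarily being included in the bitstream.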
Abstract:
The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application.
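For two input signals, an orthogonal energy-compacting transform can be illustrated by a plane rotation whose angle is chosen to put as much energy as possible into the first rotated signal. This is a minimal sketch assuming a two-channel case and a single angle as the decomposition parameter; the source covers a plurality of signals and does not state how its parameters are computed. The function names and the per-sample gain list are hypothetical.

```python
import math


def rotation_angle(x, y):
    """Angle that maximizes the energy of the first rotated signal."""
    cxx = sum(a * a for a in x)
    cyy = sum(b * b for b in y)
    cxy = sum(a * b for a, b in zip(x, y))
    return 0.5 * math.atan2(2.0 * cxy, cxx - cyy)


def rotate(x, y, theta):
    """Orthogonal 2x2 rotation of the signal pair by theta."""
    c, s = math.cos(theta), math.sin(theta)
    r1 = [c * a + s * b for a, b in zip(x, y)]
    r2 = [-s * a + c * b for a, b in zip(x, y)]
    return r1, r2


def apply_gain_profile(signal, gains):
    """Apply a time-variable gain, one value per sample in this sketch."""
    return [g * v for g, v in zip(gains, signal)]


# Two identical channels: all energy should end up in r1.
x = [1.0, -2.0, 3.0]
y = [1.0, -2.0, 3.0]
theta = rotation_angle(x, y)
r1, r2 = rotate(x, y, theta)
mono = apply_gain_profile(r1, [1.0, 0.5, 0.0])  # made-up gain profile
```

In the terms of the abstract, `mono` together with its gain profile would correspond to the monophonic layer, while `r1`, `r2`, and `theta` would correspond to the sound field layer.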