Abstract:
At an audio encoder, cue codes are generated for one or more audio channels, wherein a combined cue code (e.g., a combined inter-channel correlation (ICC) code) is generated by combining two or more estimated cue codes, each estimated cue code estimated from a group of two or more channels. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels. Received cue codes include a combined cue code (e.g., a combined ICC code). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein two or more derived cue codes are derived from the combined cue code, and each derived cue code is applied to generate two or more synthesized channels.
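The encoder-side combination and decoder-side derivation described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the estimated ICC values are combined, so the power-weighted average in `combined_icc` and the "reuse the combined value for every group" rule in `derive_iccs` are hypothetical choices, as are all function names.

```python
import math

def icc(a, b):
    """Zero-lag normalized cross-correlation between two equal-length channels."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0

def power(a):
    """Mean power of one channel."""
    return sum(x * x for x in a) / len(a)

def combined_icc(groups):
    """Encoder side: estimate one ICC per channel group, then combine them.

    Assumption: the combination is a power-weighted average of the
    per-group estimates; the abstract does not specify the rule.
    """
    weights = [power(a) + power(b) for a, b in groups]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * icc(a, b) for w, (a, b) in zip(weights, groups)) / total

def derive_iccs(combined, n_groups):
    """Decoder side: derive one ICC per group from the combined code.

    Assumption: each derived value simply equals the combined value.
    """
    return [combined] * n_groups

# Two channel groups: one fully correlated pair, one orthogonal pair.
ch_a = [1.0, 0.0, -1.0, 0.0]
ch_b = [0.0, 1.0, 0.0, -1.0]
code = combined_icc([(ch_a, ch_a), (ch_a, ch_b)])
derived = derive_iccs(code, 2)
```

Transmitting one combined code instead of one code per group reduces side information, at the cost of the decoder applying the same correlation target to every group.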
Abstract:
An input audio signal having an input temporal envelope is converted into an output audio signal having an output temporal envelope. The input temporal envelope of the input audio signal is characterized. The input audio signal is processed to generate a processed audio signal, wherein the processing de-correlates the input audio signal. The processed audio signal is adjusted based on the characterized input temporal envelope to generate the output audio signal, wherein the output temporal envelope substantially matches the input temporal envelope.
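The three steps above (characterize the envelope, de-correlate, adjust) can be sketched as follows. This is a minimal illustration under assumptions not taken from the abstract: the envelope is a frame-wise RMS, the de-correlator is a plain delay standing in for a real de-correlation filter, and the adjustment is one gain per frame. All names are hypothetical.

```python
import math

def envelope(signal, frame):
    """Characterize the temporal envelope as one RMS value per frame."""
    env = []
    for start in range(0, len(signal), frame):
        block = signal[start:start + frame]
        env.append(math.sqrt(sum(x * x for x in block) / len(block)))
    return env

def decorrelate(signal, delay):
    """Stand-in de-correlator (assumption): a plain delay.

    A real de-correlator would typically be an all-pass or reverberation
    style filter; a delay is used here only because it visibly smears
    the envelope.
    """
    return [0.0] * delay + signal[:len(signal) - delay]

def shape(processed, target_env, frame, eps=1e-12):
    """Scale each frame of the processed signal to the target envelope."""
    proc_env = envelope(processed, frame)
    out = []
    for i, x in enumerate(processed):
        k = i // frame
        gain = target_env[k] / max(proc_env[k], eps)
        out.append(x * gain)
    return out

# A short transient: quiet, loud, quiet.
frame = 4
signal = [0.1] * 4 + [1.0] * 4 + [0.1] * 4
target = envelope(signal, frame)
processed = decorrelate(signal, 2)
output = shape(processed, target, frame)
```

A frame that is entirely silent after processing cannot be scaled back up, so a practical implementation would need overlapping windows or a smoother gain curve; the sketch ignores this.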
Abstract:
An apparatus constructs a multi-channel output signal using an input signal and parametric side information. The input signal includes a first input channel and a second input channel derived from an original multi-channel signal, and the parametric side information describes interrelations between channels of the original multi-channel signal. To synthesize first and second output channels on one side of an assumed listener position, the apparatus uses base channels that differ from each other, the difference being governed by a coherence measure. Coherence between the base channels (for example, those for the reconstructed left and left surround channels) is reduced by calculating the base channel for one of those channels as a combination of the input channels, the combination being determined by the coherence measure. Thus, a high subjective quality of the reconstruction can be obtained because the original front/back coherence is approximated.
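One way to picture a coherence-controlled base channel is the sketch below. It is purely illustrative: the abstract does not give the combination rule, so the rotation-style weighting (angle `theta` growing as coherence falls) and the function names are assumptions made for this example.

```python
import math

def base_channels(left_in, right_in, coherence):
    """Return base channels for the left and left surround outputs.

    Assumption: the left base channel is the first input channel, and the
    left surround base channel mixes in the second input channel with a
    weight that grows as the coherence measure drops from 1 to 0.
    """
    c = min(max(coherence, 0.0), 1.0)
    theta = (1.0 - c) * math.pi / 4.0
    base_front = list(left_in)
    base_surround = [
        math.cos(theta) * l + math.sin(theta) * r
        for l, r in zip(left_in, right_in)
    ]
    return base_front, base_surround

def normalized_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0

# Orthogonal input channels make the effect easy to read off.
left_in = [1.0, 0.0, -1.0, 0.0]
right_in = [0.0, 1.0, 0.0, -1.0]
front_hi, surround_hi = base_channels(left_in, right_in, 1.0)
front_lo, surround_lo = base_channels(left_in, right_in, 0.0)
```

With a coherence measure of 1 the two base channels are identical; with lower values the surround base channel moves toward the other input channel, so the front/back pair is less coherent before any further synthesis is applied.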
Abstract:
There is disclosed an audio synthesizer (300) for generating a synthesis signal (336) from a downmix signal (324, x) having a number of downmix channels, the synthesis signal (336) having a number of synthesis channels, the downmix signal (324, x) being a downmixed version of an original signal (212) having a number of original channels, the audio synthesizer (300) comprising: a first path (610c') including: a first mixing matrix block (600c) configured for synthesizing a first component (336M') of the synthesis signal according to a first mixing matrix (MM) calculated from: a covariance matrix (CYR) associated with the synthesis signal (212); and a covariance matrix (Cx) associated with the downmix signal (324),
a second path (610c) for synthesizing a second component (336R') of the synthesis signal, wherein the second component (336R') is a residual component, the second path (610c) including: a prototype signal block (612c) configured for upmixing the downmix signal (324) from the number of downmix channels to the number of synthesis channels; a decorrelator (614c) configured for decorrelating the upmixed prototype signal (613c); a second mixing matrix block (618c) configured for synthesizing the second component (336R') of the synthesis signal according to a second mixing matrix (MR) from the decorrelated version (615c) of the downmix signal (324), the second mixing matrix (MR) being a residual mixing matrix,
wherein the audio synthesizer (300) is configured to calculate (618c) the second mixing matrix (MR) from: the residual covariance matrix (Cr) provided by the first mixing matrix block (600c); and an estimate of the covariance matrix of the decorrelated prototype signals (Cy) obtained from the covariance matrix (Cx) associated with the downmix signal (324),
wherein the audio synthesizer (300) further comprises an adder block (620c) for summing the first component (336M') of the synthesis signal with the second component (336R') of the synthesis signal.
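The two-path structure can be sketched as follows for real-valued signals. This is a simplified illustration, not the disclosed calculation: the main mixing matrix is taken as given, the residual covariance is computed as the target covariance minus what the main path already achieves, and the residual mixing matrix is restricted to a diagonal energy correction. The prototype matrix `q_proto` and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def residual_mixing(m_main, q_proto, c_x, c_y, eps=1e-12):
    """Return a diagonal residual mixing matrix and the residual covariance.

    Assumptions: C_r = C_y - M C_x M^T, the covariance of the decorrelated
    prototype is estimated as Q C_x Q^T, and only diagonal (energy) terms
    are corrected.
    """
    n = len(c_y)
    achieved = matmul(matmul(m_main, c_x), transpose(m_main))
    c_r = [[c_y[i][j] - achieved[i][j] for j in range(n)] for i in range(n)]
    c_hat = matmul(matmul(q_proto, c_x), transpose(q_proto))
    m_r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m_r[i][i] = math.sqrt(max(c_r[i][i], 0.0) / max(c_hat[i][i], eps))
    return m_r, c_r

def synthesize(m_main, m_r, x, d):
    """Sum the main component (M x) and the residual component (M_R d).

    x: downmix frame, rows are downmix channels.
    d: decorrelated prototype frame, rows are synthesis channels.
    """
    main = matmul(m_main, x)
    resid = matmul(m_r, d)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(main, resid)]

# One downmix channel, two synthesis channels with unit-power, uncorrelated targets.
c_x = [[1.0]]
c_y = [[1.0, 0.0], [0.0, 1.0]]
m_main = [[0.6], [0.6]]
q_proto = [[1.0], [1.0]]
m_r, c_r = residual_mixing(m_main, q_proto, c_x, c_y)
y = synthesize(m_main, m_r, [[1.0, 2.0]], [[0.5, 0.0], [0.0, 0.5]])
```

In this example the main path supplies 0.36 of the unit target power in each synthesis channel, so the residual path is scaled to supply the remaining 0.64. A full implementation would also handle the off-diagonal terms of the residual covariance, which the diagonal shortcut ignores.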