Abstract:
A multi-channel audio encoder (10) for encoding a multi-channel audio signal (101), e.g. a 5.1 channel audio signal, into a spatial down-mix (102), e.g. a stereo signal, and associated parameters (104, 105). The encoder (10) comprises first and second units (110, 120). The first unit (110) encodes the multi-channel audio signal (101) into the spatial down-mix (102) and parameters (104). These parameters (104) enable a multi-channel decoder (20) to reconstruct the multi-channel audio signal (203) from the spatial down-mix (102). The second unit (120) generates, from the spatial down-mix (102), parameters (105) that enable the decoder to reconstruct the spatial down-mix (202) from an alternative down-mix (103), e.g. a so-called artistic down-mix that has been manually mixed in a sound studio. In this way, the decoder (20) can efficiently deal with a situation in which an alternative down-mix (103) is received instead of the regular spatial down-mix (102). In the decoder (20), first the spatial down-mix (202) is reconstructed from the alternative down-mix (103) and the parameters (105). Next, the spatial down-mix (202) is decoded into the multi-channel audio signal (203).
Abstract:
A decoder receives (501) a bitstream comprising an encoded mono signal and stereo data. A time scale processor (503) generates a time scaled mono signal. A time-to-frequency processor generates frequency sample blocks of the time scaled signal, the block length being fixed and independent of the time scaling. A parametric stereo decoder (509) generates a stereo signal for the frequency sample blocks and these are converted to the time domain by a frequency-to-time processor (511). A synchronization processor (515) synchronizes the stereo data with the time scaled signal by determining a time association between a parameter value and a frequency sample block. The parameter value and time association are used to determine synchronized stereo parameter values for that and other frequency sample blocks. The invention is particularly suitable for low complexity generation of time scaled stereo signals from MPEG-4 encoded signals.
Abstract:
A device (1) for converting a first number (M) of input audio channels into a second, larger number (N) of output audio channels comprises: decorrelation units (3) for decomposing the input audio channels into a set of decorrelated auxiliary channels, at least one upmix unit (4) for combining the decorrelated auxiliary channels into the output audio channels, and at least one pre-processing unit (2) for pre-processing the input audio channels and feeding the pre-processed input audio channels to the decorrelation units (3). The pre-processing unit (2) and the upmix unit (4) are preferably controlled by audio parameters.
Abstract:
An audio encoder for encoding a multi-channel audio signal includes: an encoder combination module (ECM) for generating a dominant signal part (m) and a residual signal part (s) being a combined representation of first and second audio signals (x1, x2), the dominant and residual signal parts (m, s) being obtained by applying a mathematical procedure to the first and second audio signals (x1, x2), wherein the mathematical procedure involves a first spatial parameter (SP1) including a description of spatial properties of the first and second audio signals (x1, x2); a parameter generator (PG) for generating a first parameter set (PS1) including a second spatial parameter (SP2), and a second parameter set (PS2) including a third spatial parameter (SP3); and an output generator for generating an encoded output signal having a first output part (OP1) including the dominant signal part (m) and the first parameter set (PS1), and a second output part (OP2) including the residual signal part (s) and the second parameter set (PS2).
Abstract:
A method and a device for processing a stereo signal obtained from an encoder, which codes an N-channel audio signal into spatial parameters (P) and a stereo down-mix comprising first and second stereo signals (L0, R0). A first signal and a third signal are added in order to obtain a first output signal (L0w), wherein the first signal (L0wL) comprises the first stereo signal (L0) modified by a first complex function (g1), and the third signal (L0wR) comprises the second stereo signal (R0) modified by a third complex function (g3). A second signal and a fourth signal are added to obtain a second output signal (R0w). The fourth signal (R0wR) comprises the second stereo signal (R0) modified by a fourth complex function (g4), and the second signal (R0wL) comprises the first stereo signal (L0) modified by a second complex function (g2). The complex functions (g1, g2, g3, g4) are functions of the spatial parameters (P) and are chosen such that the energy value of the difference (L0wL - R0wL) between the first signal and the second signal is larger than or equal to the energy value of the sum (L0wL + R0wL) of the first and the second signal, and the energy value of the difference (R0wR - L0wR) between the fourth signal and the third signal is larger than or equal to the energy value of the sum (R0wR + L0wR) of the fourth signal and the third signal.
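The mixing structure in this abstract amounts to a 2x2 complex matrix applied to the stereo down-mix, and it can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the function names (`energy`, `process_stereo`, `meets_energy_condition`) and the sample gain values are hypothetical, and the gains are passed in directly rather than derived from the spatial parameters (P), since the abstract does not say how that derivation works.

```python
def energy(signal):
    """Sum of squared magnitudes of a sequence of (possibly complex) samples."""
    return sum(abs(x) ** 2 for x in signal)


def process_stereo(l0, r0, g1, g2, g3, g4):
    """Mix the stereo down-mix (l0, r0) with four complex gains.

    Returns the two output signals and the four intermediate signals.
    """
    l0w_l = [g1 * x for x in l0]  # first signal:  L0 modified by g1
    r0w_l = [g2 * x for x in l0]  # second signal: L0 modified by g2
    l0w_r = [g3 * x for x in r0]  # third signal:  R0 modified by g3
    r0w_r = [g4 * x for x in r0]  # fourth signal: R0 modified by g4
    l0w = [a + b for a, b in zip(l0w_l, l0w_r)]  # first output, L0w
    r0w = [a + b for a, b in zip(r0w_l, r0w_r)]  # second output, R0w
    return l0w, r0w, (l0w_l, r0w_l, l0w_r, r0w_r)


def meets_energy_condition(parts):
    """Check the two energy inequalities stated in the abstract."""
    l0w_l, r0w_l, l0w_r, r0w_r = parts
    diff_left = energy([a - b for a, b in zip(l0w_l, r0w_l)])
    sum_left = energy([a + b for a, b in zip(l0w_l, r0w_l)])
    diff_right = energy([a - b for a, b in zip(r0w_r, l0w_r)])
    sum_right = energy([a + b for a, b in zip(r0w_r, l0w_r)])
    return diff_left >= sum_left and diff_right >= sum_right


# Hypothetical gains: cross terms in anti-phase with the direct terms.
l0w, r0w, parts = process_stereo([1.0, 2.0], [0.5, -1.0], 1.0, -0.5, -0.5, 1.0)
print(l0w, r0w, meets_energy_condition(parts))
```

With these example gains the cross terms are in anti-phase with the direct terms, so each difference carries more energy than the corresponding sum. Flipping the sign of g2 and g3 violates the condition.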
Abstract:
Synthesizing an output audio signal is provided on the basis of an input audio signal, the input audio signal comprising a plurality of input sub-band signals, wherein at least one input sub-band signal is transformed (T) from the sub-band domain to the frequency domain to obtain at least one respective transformed signal, wherein the at least one input sub-band signal is delayed and transformed (D, T) to obtain at least one respective transformed delayed signal, wherein at least two processed signals are derived (P) from the at least one transformed signal and the at least one transformed delayed signal, wherein the processed signals are inverse transformed (T^-1) from the frequency domain to the sub-band domain to obtain respective processed sub-band signals, and wherein the output audio signal is synthesized from the processed sub-band signals.
Abstract:
In the method of coding the audio signal, the values of first parameters (P1,i), which represent aspects of the audio signal at a first instant (t1), are calculated to obtain first calculated values (A1,i). The values of second parameters (P2,i), which represent the aspects of the audio signal at a second, later, instant (t2), are calculated to obtain second calculated values (A2,i). The number of the first parameters (P1,i) and the number of the second parameters (P2,i) differ. A subset (SUS2,i) of the second parameters (P2,i) is associated with a particular portion (SFRAi) of a frequency range (FR) of the audio signal. This frequency range (FR) of the audio signal is preferably selected to cover all the frequencies present in the audio signal. The values (A2,i) of the subset (SUS2,i) of the second parameters (P2,i) are coded based on a difference between this subset (SUS2,i) and a subset (SUS1,i) of the first calculated value(s) (A1,i) associated with substantially this same particular portion (SFRAi) of the frequency range (FR). Thus the differentially coded values (7) of the second parameters (P2,i) are obtained by coding the difference of the values of second parameters (P2,i) and first parameters (P1,i) which are associated with substantially the same frequency subrange (SFRAi). This allows the parameters (P1,i, P2,i) to be differentially coded even if the number of parameters changes over time.
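A small sketch can make the idea of differential coding across a changing parameter count concrete. It rests on several assumptions that are not in the abstract: each parameter is tied to a frequency band given as a `(low, high)` tuple, "substantially the same portion" is read as "the band at the first instant with the largest overlap", and each second parameter is coded individually. All function names are hypothetical.

```python
def overlap(a, b):
    """Width of the overlap between two frequency bands given as (low, high)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def reference_index(bands1, band2):
    """Index of the band at the first instant that overlaps band2 the most."""
    return max(range(len(bands1)), key=lambda i: overlap(bands1[i], band2))


def encode_differential(bands1, values1, bands2, values2):
    """Code each value at t2 as a difference from the matching value at t1."""
    return [v2 - values1[reference_index(bands1, b2)]
            for b2, v2 in zip(bands2, values2)]


def decode_differential(bands1, values1, bands2, diffs):
    """Invert encode_differential using the same band matching."""
    return [d + values1[reference_index(bands1, b2)]
            for b2, d in zip(bands2, diffs)]


# Two parameters at t1, four at t2, covering the same 0-4000 Hz range.
bands1, values1 = [(0, 1000), (1000, 4000)], [10, 20]
bands2 = [(0, 500), (500, 1000), (1000, 2000), (2000, 4000)]
values2 = [11, 9, 22, 19]

diffs = encode_differential(bands1, values1, bands2, values2)
restored = decode_differential(bands1, values1, bands2, diffs)
print(diffs, restored)
```

Because the decoder repeats the same band matching, only the band layout and the differences need to be available to recover the values at the second instant.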
Abstract:
The invention relates to a linking unit (100), a parametric encoder (400) and a method for generating linking information L indicating components of consecutive extended segments sp and sc which may be linked together in order to form a sinusoidal track. The segments sp and sc approximate consecutive segments of a sinusoidal audio or speech signal s. The linking unit comprises a calculating unit (120) for generating a similarity matrix S(m,n) in response to received sinusoidal code data and an evaluating unit (140) for receiving and evaluating said similarity matrix S in order to generate said linking information by selecting those pairs of components m,n the similarity of which is maximal. According to the invention the calculating unit (120) is adapted to calculate the similarity matrix S by additionally considering information about the phase consistency between the components of the extended previous segment sp and the extended current segment sc. In that way the selection of components suitable for being linked together is improved resulting in the definition of correct tracks.
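The linking step described here (build a similarity matrix, then pick the most similar pairs) can be illustrated with a simplified sketch. Everything specific in it is an assumption for illustration only: components are reduced to `(frequency_hz, phase_rad)` tuples, the similarity is a product of a frequency-closeness term (with an arbitrary 50 Hz scale) and a phase-consistency term, pairs are picked greedily, and the threshold value is made up. The abstract does not give the actual similarity measure or selection rule.

```python
import math


def similarity(prev, cur, hop):
    """Similarity in [0, 1] between two sinusoidal components.

    prev, cur: (frequency_hz, phase_rad); hop: seconds between the segments.
    """
    f_prev, ph_prev = prev
    f_cur, ph_cur = cur
    freq_term = math.exp(-abs(f_cur - f_prev) / 50.0)  # assumed 50 Hz scale
    # Phase expected at the current segment if the track continued smoothly.
    predicted = ph_prev + 2 * math.pi * 0.5 * (f_prev + f_cur) * hop
    phase_term = 0.5 * (1.0 + math.cos(ph_cur - predicted))  # 1 when consistent
    return freq_term * phase_term


def link(prev_comps, cur_comps, hop, threshold=0.1):
    """Greedily link the pairs (m, n) with the highest similarity S[m][n]."""
    s = [[similarity(p, c, hop) for c in cur_comps] for p in prev_comps]
    candidates = sorted(
        ((s[m][n], m, n)
         for m in range(len(prev_comps)) for n in range(len(cur_comps))),
        reverse=True,
    )
    used_m, used_n, links = set(), set(), []
    for score, m, n in candidates:
        if score < threshold:
            break  # remaining pairs are too dissimilar to link
        if m not in used_m and n not in used_n:
            links.append((m, n))
            used_m.add(m)
            used_n.add(n)
    return links


hop = 0.01
prev = [(440.0, 0.0), (880.0, 0.0)]
cur = [(882.0, 2 * math.pi * 881.0 * hop), (441.0, 2 * math.pi * 440.5 * hop)]
links = link(prev, cur, hop)
print(links)
```

In this toy example a component whose frequency matches but whose phase is inverted relative to the prediction scores close to zero and is left unlinked, which is the kind of false track the phase-consistency information is meant to avoid.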
Abstract:
A parametric stereo upmix apparatus generates left and right signals from a mono downmix signal based on spatial parameters. The parametric stereo upmix apparatus includes a predictor configured to predict a difference signal including a difference between the left and right signals based on the mono downmix signal scaled with a prediction coefficient. The prediction coefficient is derived from the spatial parameters. The parametric stereo upmix apparatus further includes an arithmetic unit configured to derive the left and right signals based on a sum and a difference of the mono downmix signal and the difference signal.
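The predict-then-sum/difference structure can be sketched as follows. The abstract does not state how the prediction coefficient is derived, so the formula used here is an assumption: it is the least-squares coefficient for predicting d = (l - r) / 2 from m = (l + r) / 2, expressed through a linear left/right power ratio (`iid`) and an inter-channel correlation (`icc`). The function names and the downmix convention are hypothetical as well.

```python
import math


def prediction_coefficient(iid, icc):
    """Least-squares coefficient for predicting d = (l - r) / 2 from m = (l + r) / 2.

    iid: left/right power ratio (linear, not dB); icc: inter-channel correlation.
    Assumed formula, not taken from the abstract. Undefined when the
    denominator is zero (iid == 1 and icc == -1, i.e. a silent downmix).
    """
    return (iid - 1.0) / (iid + 1.0 + 2.0 * icc * math.sqrt(iid))


def upmix(mono, alpha):
    """Predict the difference signal from the mono downmix, then form sum and difference."""
    diff = [alpha * m for m in mono]             # predicted difference signal
    left = [m + d for m, d in zip(mono, diff)]   # sum of downmix and difference
    right = [m - d for m, d in zip(mono, diff)]  # difference of the two
    return left, right


# Fully correlated example: left has twice the amplitude of right (power ratio 4).
alpha = prediction_coefficient(4.0, 1.0)
left, right = upmix([1.5, -3.0], alpha)
print(alpha, left, right)
```

In this fully correlated example the prediction alone recovers the original channels. For partially correlated signals a scaled downmix cannot represent the uncorrelated part of the difference signal, so the result is only an approximation.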