Abstract:
For transient audio input signals in a multi-channel audio reconstruction, uncorrelated output signals are generated from an audio input signal by mixing the audio input signal with a representation of the audio input signal that is delayed by a delay time, such that, in a first time interval, a first output signal corresponds to the audio input signal and a second output signal corresponds to the delayed representation of the audio input signal, whereas, in a second time interval, the first output signal corresponds to the delayed representation of the audio input signal and the second output signal corresponds to the audio input signal.
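The interval-wise swapping described above can be illustrated with a minimal Python sketch. The function name `swap_decorrelate`, the integer sample delay, and the fixed interval length are assumptions made for illustration only; the abstract does not specify how the delay or the intervals are chosen.

```python
def swap_decorrelate(x, delay, interval):
    """Build two output signals from x and a copy of x delayed by `delay` samples.

    In even-numbered intervals the first output takes the direct signal and
    the second output takes the delayed one; in odd-numbered intervals the
    roles are swapped. (Hypothetical helper, not taken from the source.)
    """
    if delay < 0 or interval <= 0:
        raise ValueError("delay must be >= 0 and interval must be > 0")
    # Delayed representation, zero-padded at the start and cut to len(x).
    delayed = ([0.0] * delay + list(x))[:len(x)]
    out1, out2 = [], []
    for n, (direct, late) in enumerate(zip(x, delayed)):
        if (n // interval) % 2 == 0:
            out1.append(direct)
            out2.append(late)
        else:
            out1.append(late)
            out2.append(direct)
    return out1, out2


first, second = swap_decorrelate([1, 2, 3, 4, 5, 6], delay=1, interval=2)
```

Each output alternates between the direct and the delayed signal, so neither channel consistently carries the earlier copy of a transient.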
Abstract:
A selected channel of a multi-channel signal, represented by frames composed of sampling values having a high time resolution, can be encoded with higher quality when a waveform parameter representation is derived that represents a waveform of an intermediate-resolution representation of the selected channel. The waveform parameter representation with the intermediate resolution can be used to shape a reconstructed channel so as to retrieve a channel whose signal envelope is close to that of the selected original channel. The time scale on which the shaping is performed is shorter than the time scale of framewise processing, which enhances the quality of the reconstructed channel. At the same time, the shaping time scale is longer than the time scale of the sampling values, which significantly reduces the amount of data needed by the waveform parameter representation.
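As a rough illustration of shaping at an intermediate resolution, the sketch below computes one RMS value per block of samples within a frame and rescales a reconstructed frame block by block. Using RMS as the waveform parameter, and the names `block_rms` and `shape_to_envelope`, are assumptions for this example; the abstract does not name a specific parameter.

```python
import math


def block_rms(frame, block_len):
    """One RMS value per block of `block_len` samples (intermediate resolution)."""
    values = []
    for start in range(0, len(frame), block_len):
        block = frame[start:start + block_len]
        values.append(math.sqrt(sum(s * s for s in block) / len(block)))
    return values


def shape_to_envelope(reconstructed, target_env, block_len, eps=1e-12):
    """Scale each block of `reconstructed` so its RMS matches `target_env`."""
    current_env = block_rms(reconstructed, block_len)
    shaped = []
    for b, (cur, tgt) in enumerate(zip(current_env, target_env)):
        gain = tgt / max(cur, eps)  # eps guards against silent blocks
        block = reconstructed[b * block_len:(b + 1) * block_len]
        shaped.extend(gain * s for s in block)
    return shaped


original = [1.0] * 4 + [0.1] * 4          # one 8-sample frame with a decay
envelope = block_rms(original, 4)         # 2 parameters instead of 8 samples
shaped = shape_to_envelope([0.5] * 8, envelope, 4)
```

Here the frame of eight samples is described by two parameters, which is coarser than the sampling values but finer than a single per-frame value.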
Abstract:
At an audio encoder, cue codes are generated for one or more audio channels, wherein an envelope cue code is generated by characterizing a temporal envelope in an audio channel. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels, where C > E ≥ 1. Received cue codes include an envelope cue code corresponding to a characterized temporal envelope of an audio channel corresponding to the transmitted channel(s). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein the envelope cue code is applied to an upmixed channel or a synthesized signal to adjust a temporal envelope of the synthesized signal based on the characterized temporal envelope such that the adjusted temporal envelope substantially matches the characterized temporal envelope.
Abstract:
A down mixer for downmixing a multi-channel signal having at least two channels, comprises: a weighting value estimator (100) for estimating band-wise weighting values for the at least two channels; a spectral weighter (200) for weighting spectral domain representations of the at least two channels using the band-wise weighting values; a converter (300) for converting weighted spectral domain representations of the at least two channels into time representations of the at least two channels; and a mixer (400) for mixing the time representations of the at least two channels to obtain a downmix signal.
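The order of operations in this downmixer (weight in the spectral domain, convert each channel back to the time domain, then mix) can be sketched as follows. This is a minimal illustration with a naive DFT from the standard library; the band-wise weights are passed in as given because the abstract does not say how the estimator (100) derives them, and the helper names and the uniform band layout are assumptions.

```python
import cmath


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def band_of_bin(k, n, num_bands):
    """Map bin k and its mirror bin n-k to the same band so outputs stay real."""
    f = min(k, n - k)                       # folded frequency index, 0..n//2
    return f * num_bands // (n // 2 + 1)


def downmix(channels, weights):
    """channels: equal-length sample lists; weights[c][b]: weight of band b in channel c."""
    n = len(channels[0])
    if any(len(ch) != n for ch in channels):
        raise ValueError("all channels must have the same length")
    time_reps = []
    for ch, w in zip(channels, weights):
        spec = dft(ch)                                           # spectral representation
        weighted = [spec[k] * w[band_of_bin(k, n, len(w))] for k in range(n)]
        time_reps.append(idft(weighted))                         # back to the time domain
    return [sum(samples) for samples in zip(*time_reps)]         # mix in the time domain


left = [1, 0, -1, 0, 1, 0, -1, 0]
right = [0.5] * 8
mono = downmix([left, right], [[0.5, 0.5], [0.5, 0.5]])
```

With equal weights of 0.5 in every band the result reduces to a plain average of the two channels; band-dependent weights let the downmix favor one channel in some bands only.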
Abstract:
A multisignal encoder for encoding at least three audio signals, comprises: a signal preprocessor (100) for individually preprocessing each audio signal to obtain at least three preprocessed audio signals, wherein the preprocessing is performed so that a preprocessed audio signal is whitened with respect to the signal before preprocessing; an adaptive joint signal processor (200) for performing a processing of the at least three preprocessed audio signals to obtain at least three jointly processed signals or at least two jointly processed signals and an unprocessed signal; a signal encoder (300) for encoding each signal to obtain one or more encoded signals; and an output interface (400) for transmitting or storing an encoded multisignal audio signal comprising the one or more encoded signals, side information relating to the preprocessing and side information relating to the processing.
Abstract:
An apparatus for decoding an encoded multichannel signal, comprises: a base channel decoder (700) for decoding an encoded base channel to obtain a decoded base channel; a decorrelation filter (800) for filtering at least a portion of the decoded base channel to obtain a filling signal; and a multichannel processor (900) for performing a multichannel processing using a spectral representation of the decoded base channel and a spectral representation of the filling signal, wherein the decorrelation filter (800) is a broad band filter and the multichannel processor (900) is configured to apply a narrow band processing to the spectral representation of the decoded base channel and the spectral representation of the filling signal.
Abstract:
A bandwidth extension decoder (500, 600) for providing a bandwidth extended audio signal (532) based on an input audio signal (502) and a parameter signal (504), wherein the parameter signal (504) comprises an indication of an offset frequency and an indication of a power density parameter, comprises: a patch generator (510) configured to generate a bandwidth extension high-frequency signal (512) comprising a high-frequency band, wherein the high-frequency band of the bandwidth extension high-frequency signal (512) is generated based on a frequency shift of a frequency band of the input audio signal (502), wherein the frequency shift is based on the offset frequency, and wherein the patch generator (510) is configured to amplify or attenuate the high-frequency band of the bandwidth extension high-frequency signal (512) by a factor equal to the value of the power density parameter or equal to the reciprocal value of the power density parameter, respectively; a combiner (529) configured to combine the bandwidth extension high-frequency signal (512) and the input audio signal (502) to obtain the bandwidth extended audio signal (532); and an output interface (530) configured to provide the bandwidth extended audio signal (532).
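A minimal sketch of the patch generation and combination steps, under the assumption that the signal is handled as a list of spectral bins and that the offset frequency is expressed as a whole number of bins. The function name `extend_bandwidth` and its parameters are hypothetical; the abstract does not state in which domain the shift is carried out.

```python
def extend_bandwidth(spectrum, num_bins_out, offset_bins, power_density, amplify=True):
    """Fill bins above the input band with a shifted, scaled copy of lower bins.

    spectrum      : bins of the band-limited input signal (index = frequency bin)
    num_bins_out  : number of bins in the bandwidth-extended result
    offset_bins   : frequency shift, in bins (assumed integer for this sketch)
    power_density : scaling parameter; used directly when amplifying and as a
                    reciprocal when attenuating
    """
    if power_density <= 0:
        raise ValueError("power_density must be positive")
    gain = power_density if amplify else 1.0 / power_density
    cutoff = len(spectrum)
    # Combiner: keep the input bins and append the generated high band.
    out = list(spectrum) + [0.0] * (num_bins_out - cutoff)
    for k in range(cutoff, num_bins_out):
        src = k - offset_bins              # source bin before the frequency shift
        if 0 <= src < cutoff:
            out[k] = gain * spectrum[src]
    return out


extended = extend_bandwidth([4, 3, 2, 1], num_bins_out=8, offset_bins=4,
                            power_density=2, amplify=False)
```

In this example the four input bins are copied four bins upward and attenuated by the reciprocal of the power density parameter.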
Abstract:
For a bandwidth extension of an audio signal, the audio signal is temporally spread in a signal spreader by a spread factor greater than 1. The temporally spread audio signal is then supplied to a decimator, which decimates the temporally spread version by a decimation factor matched to the spread factor. The band generated by this decimation operation is extracted and distorted, and finally combined with the audio signal to obtain a bandwidth extended audio signal. A phase vocoder in a filterbank implementation or a transform implementation may be used for the signal spreading.
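Why spreading followed by decimation produces a new, higher band can be seen on a single sinusoid. The sketch below is an idealized illustration only: `ideal_spread_sine` is a hypothetical stand-in for the phase vocoder (for a pure tone, spreading keeps the frequency and lengthens the duration), and the decimator simply keeps every `factor`-th sample without any filtering.

```python
import math


def ideal_spread_sine(freq_hz, fs, num_samples, spread):
    """Idealized time spreading of a sinusoid: same frequency, `spread` times longer."""
    return [math.sin(2 * math.pi * freq_hz * n / fs)
            for n in range(num_samples * spread)]


def decimate(x, factor):
    """Keep every `factor`-th sample (no anti-aliasing filter in this sketch)."""
    return x[::factor]


fs = 8000
spread = 2
stretched = ideal_spread_sine(1000, fs, num_samples=16, spread=spread)
high_band = decimate(stretched, spread)   # original duration, tone now at 2000 Hz
```

After decimation by the matched factor the signal has its original length again, while the 1000 Hz tone appears at 2000 Hz when played back at the same sampling rate.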