摘要:
An audio signal decoder for providing a decoded multi-channel audio signal representation on the basis of an encoded multi-channel audio signal representation has a time warp decoder configured to selectively use individual audio channel specific time warp contours or a joint multi-channel time warp contour for a reconstruction of a plurality of audio channels represented by the encoded multi-channel audio signal representation. An audio signal encoder for providing an encoded representation of a multi-channel audio signal has an encoded audio representation provider configured to selectively provide an audio representation having a common time warp contour information, commonly associated with a plurality of audio channels of the multi-channel audio signal, or an encoded audio representation having individual time warp contour information, individually associated with the different audio channels of the plurality of audio channels, in dependence on an information describing a similarity or difference between time warp contours associated with the audio channels of the plurality of audio channels.
摘要:
Apparatus for converting an audio signal into a parameterized representation, has a signal analyzer for analyzing a portion of the audio signal to obtain an analysis result; a band pass estimator for estimating information of a plurality of band pass filters based on the analysis result, wherein the information on the plurality of band pass filters has information on a filter shape for the portion of the audio signal, wherein the band width of a band pass filter is different over an audio spectrum and depends on the center frequency of the band pass filter; a modulation estimator for estimating an amplitude modulation or a frequency modulation or a phase modulation for each band of the plurality of band pass filters for the portion of the audio signal using the information on the plurality of band pass filters; and an output interface for transmitting, storing or modifying information on the amplitude modulation, information on the frequency modulation or phase modulation or the information on the plurality of band pass filters for the portion of the audio signal.
摘要:
A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representations or the second sampled representation.
摘要:
At an audio encoder, cue codes are generated for one or more audio channels, wherein an envelope cue code is generated by characterizing a temporal envelope in an audio channel. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels, where C>E≧1. Received cue codes include an envelope cue code corresponding to a characterized temporal envelope of an audio channel corresponding to the transmitted channel(s). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein the envelope cue code is applied to an upmixed channel or a synthesized signal to adjust a temporal envelope of the synthesized signal based on the characterized temporal envelope such that the adjusted temporal envelope substantially matches the characterized temporal envelope.
摘要:
According to an inventive scheme for introducing a watermark into an information signal, the information signal is at first transferred from a time representation to a spectral/modulation spectral representation). The information signal is then manipulated in the spectral/modulation spectral representation in dependence on the watermark to be introduced to obtain a modified spectral/modulation spectral representation, and subsequently an information signal provided with a watermark is formed based on the modified spectral/modulation spectral representation. An advantage is that, due to the fact that the watermark is embedded and/or derived in the spectral/modulation spectral representation or range, traditional correlation attacks as are used in watermark methods based on a spread-band modulation cannot succeed easily.
摘要:
In the transition into the logarithmic range, not the entire bit width of the result linearly dependent upon the square of the value must be considered. Rather, it is possible to scale the result of a value with x bits such that a representation with less than x bits of the result is sufficient to receive the logarithmic representation based thereon. The effect of the scaling factor on the resulting logarithmic representation may be compensated for by adding or subtracting a correction value received by the logarithm function applied to the scaling factor to or from the scaled logarithmic representation without any loss of dynamics. This way, a method and an apparatus for creating a representation of a result linearly dependent upon a square of a value are provided so that the calculation is simple and/or possible with little hardware expenditure.
摘要:
The present invention is based on the finding that a reconstructed output channel, reconstructed with a multi-channel reconstructor using at least one downmix channel derived by downmixing a plurality of original channels and using a parameter representation including additional information on a temporal fine structure of an original channel can be reconstructed efficiently with high quality, when a generator for generating a direct signal component and a diffuse signal component based on the downmix channel is used. The quality can be essentially enhanced, if only the direct signal component is modified such that the temporal fine structure of the reconstructed output channel is fitting a desired temporal fine structure, indicated by the additional information on the temporal fine structure transmitted.
摘要:
A selected channel of a multi-channel signal which is represented by frames composed from sampling values having a high time resolution can be encoded with higher quality when a wave form parameter representation representing a wave form of an intermediate resolution representation of the selected channel is derived, the wave form parameter representation including a sequence of intermediate wave form parameters having a time resolution lower than the high time resolution of the sampling values and higher than a time resolution defined by a frame repetition rate. The wave form parameter representation with the intermediate resolution can be used to shape a reconstructed channel to retrieve a channel having a signal envelope close to that one of the selected original channel. The time scale on which the shaping is performed is shorter than the time scale of a framewise processing, thus enhancing the quality of the reconstructed channel. On the other hand, the shaping time scale is larger than the time scale of the sampling values, significantly reducing the amount of data needed by the wave form parameter representation.
摘要:
An audio signal decoder has a time warp contour calculator, a time warp contour data rescaler and a warp decoder. The time warp contour calculator is configured to generate time warp contour data repeatedly restarting from a predetermined time warp contour start value, based on time warp contour evolution information describing a temporal evolution of the time warp contour. The time warp contour data rescaler is configured to rescale at least a portion of the time warp contour data such that a discontinuity at a restart is avoided, reduced or eliminated in a rescaled version of the time warp contour. The warp decoder is configured to provide the decoded audio signal representation, based on an encoded audio signal representation and using the rescaled version of the time warp contour.
摘要:
An apparatus for determining a spatial output multi-channel audio signal based on an input audio signal and an input parameter. The apparatus includes a decomposer for decomposing the input audio signal based on the input parameter to obtain a first decomposed signal and a second decomposed signal different from each other. Furthermore, the apparatus includes a renderer for rendering the first decomposed signal to obtain a first rendered signal having a first semantic property and for rendering the second decomposed signal to obtain a second rendered signal having a second semantic property being different from the first semantic property. The apparatus comprises a processor for processing the first rendered signal and the second rendered signal to obtain the spatial output multi-channel audio signal.