Abstract:
An inventive method for introducing information into a data stream that includes data about spectral values representing a short-term spectrum of an audio signal first processes the data stream to obtain the spectral values of the short-term spectrum of the audio signal. In addition, the information to be introduced is combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information signal is generated and then weighted with an established psychoacoustically maskable noise energy to generate a weighted information signal, such that the energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal are then summed and afterwards processed again to obtain a processed data stream including both the audio information and the information to be introduced. Because the information is introduced into the data stream without changing to the time domain, the block raster underlying the short-term spectrum is left untouched, so that introducing a watermark does not lead to tandem encoding effects.
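The spreading, weighting and summing steps of this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names `spread_bits` and `embed`, the pseudo-random chip generator, and the simplification that each chip maps directly onto one spectral line (instead of transforming a spread time signal) are all assumptions made for illustration.

```python
import random

def spread_bits(bits, chips_per_bit, seed=0):
    """Spread each payload bit over chips_per_bit pseudo-random +/-1 chips.

    Hypothetical helper: the abstract only says a "spread sequence" is used.
    """
    rng = random.Random(seed)
    pn = [rng.choice((-1.0, 1.0)) for _ in range(chips_per_bit)]
    return [(1.0 if b else -1.0) * c for b in bits for c in pn]

def embed(spectral_values, spread_spectrum, mask_energy):
    """Weight the spread information with the maskable noise energy per
    spectral line and add it to the audio spectral values.

    Assumption: spread_spectrum already holds one +/-1 value per line.
    """
    out = []
    for x, s, m in zip(spectral_values, spread_spectrum, mask_energy):
        w = s * (m ** 0.5)  # amplitude whose energy equals the maskable energy m
        out.append(x + w)   # summation stays in the spectral domain
    return out
```

Because the added component on each line has energy equal to the maskable energy for that line, it stays at the masking threshold; a real system would obtain `mask_energy` from a psychoacoustic model and would then requantize and repack the summed values into the data stream.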
Abstract:
An input multi-channel representation is converted into a different output multi-channel representation of a spatial audio signal, in that an intermediate representation of the spatial audio signal is derived, the intermediate representation having direction parameters indicating a direction of origin of a portion of the spatial audio signal; and in that the output multi-channel representation of the spatial audio signal is generated using the intermediate representation of the spatial audio signal.
Abstract:
According to an inventive scheme for introducing a watermark into an information signal, the information signal is first transferred from a time representation to a spectral/modulation spectral representation. The information signal is then manipulated in the spectral/modulation spectral representation in dependence on the watermark to be introduced to obtain a modified spectral/modulation spectral representation, and subsequently an information signal provided with a watermark is formed based on the modified spectral/modulation spectral representation. An advantage is that, because the watermark is embedded and/or derived in the spectral/modulation spectral representation or range, traditional correlation attacks of the kind used against watermark methods based on spread-band modulation cannot succeed easily.
Abstract:
The present invention is based on the finding that an output channel, reconstructed with a multi-channel reconstructor using at least one downmix channel derived by downmixing a plurality of original channels and using a parameter representation including additional information on a temporal fine structure of an original channel, can be reconstructed efficiently with high quality when a generator for generating a direct signal component and a diffuse signal component based on the downmix channel is used. The quality can be substantially enhanced if only the direct signal component is modified such that the temporal fine structure of the reconstructed output channel fits a desired temporal fine structure indicated by the transmitted additional information on the temporal fine structure.
Abstract:
A time-discrete audio signal is processed to provide a quantization block with quantized spectral values. Furthermore, an integer spectral representation is generated from the time-discrete audio signal using an integer transform algorithm. The quantization block, having been generated using a psychoacoustic model, is inversely quantized and rounded, and a difference is then formed between the integer spectral values and the inversely quantized, rounded spectral values; this difference forms the combination block. The quantization block alone provides a lossy, psychoacoustically coded/decoded audio signal after decoding, whereas the quantization block together with the combination block provides a lossless or almost lossless coded and decoded audio signal. Generating the differential signal in the frequency domain results in a simpler coder/decoder structure.
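The two-layer structure described here (a lossy quantization block plus a frequency-domain difference) can be sketched as follows. This is an illustrative sketch only: a single uniform step size `step` stands in for the psychoacoustically controlled quantizer, and the names `encode` and `decode` are hypothetical.

```python
def encode(int_spectrum, step):
    """Return (quantization block, difference block) for integer spectral values.

    Assumption: one uniform quantizer step replaces the psychoacoustic model.
    """
    quant = [round(v / step) for v in int_spectrum]   # lossy layer
    recon = [round(q * step) for q in quant]          # inverse quantize, then round
    diff = [v - r for v, r in zip(int_spectrum, recon)]  # difference in the frequency domain
    return quant, diff

def decode(quant, step, diff=None):
    """Decode the lossy layer alone, or add the difference for lossless output."""
    recon = [round(q * step) for q in quant]
    if diff is None:
        return recon
    return [r + d for r, d in zip(recon, diff)]
```

Since the decoder repeats exactly the same inverse quantization and rounding as the encoder, adding the difference restores the integer spectral values bit for bit, while a decoder that ignores the difference still obtains the lossy approximation.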
Abstract:
A selected channel of a multi-channel signal, represented by frames composed of sampling values having a high time resolution, can be encoded with higher quality when a waveform parameter representation representing a waveform of an intermediate resolution representation of the selected channel is derived, the waveform parameter representation including a sequence of intermediate waveform parameters having a time resolution lower than the high time resolution of the sampling values and higher than a time resolution defined by a frame repetition rate. The waveform parameter representation with the intermediate resolution can be used to shape a reconstructed channel to retrieve a channel having a signal envelope close to that of the selected original channel. The time scale on which the shaping is performed is shorter than the time scale of framewise processing, thus enhancing the quality of the reconstructed channel. On the other hand, the shaping time scale is longer than the time scale of the sampling values, significantly reducing the amount of data needed by the waveform parameter representation.
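The idea of intermediate-resolution shaping can be illustrated with a small sketch. The abstract does not say what the waveform parameters are; using one RMS value per time slot, and the names `envelope_parameters` and `shape`, are assumptions made here, as is the requirement that the frame length be a multiple of the slot count.

```python
def envelope_parameters(frame, slots):
    """One RMS value per time slot: coarser than samples, finer than frames.

    Assumption: len(frame) is a multiple of slots.
    """
    n = len(frame) // slots
    return [(sum(x * x for x in frame[i * n:(i + 1) * n]) / n) ** 0.5
            for i in range(slots)]

def shape(reconstructed, target_env):
    """Scale each slot of the reconstructed frame to match the target envelope."""
    slots = len(target_env)
    n = len(reconstructed) // slots
    current = envelope_parameters(reconstructed, slots)
    out = []
    for i in range(slots):
        # Avoid division by zero for silent slots.
        gain = target_env[i] / current[i] if current[i] > 0 else 0.0
        out.extend(gain * x for x in reconstructed[i * n:(i + 1) * n])
    return out
```

With 2 slots per 8-sample frame, only 2 parameters are transmitted instead of 8 sample values, yet a transient in the second half of the frame is still placed correctly, which a single per-frame gain could not do.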
Abstract:
For a multi-channel audio signal, parametric coding is applied to different subsets of audio input channels for different frequency regions. For example, for a 5.1 surround sound signal having five regular channels and one low-frequency effects (LFE) channel, binaural cue coding (BCC) can be applied to all six audio channels for sub-bands at or below a specified cut-off frequency, but to only five audio channels (excluding the LFE channel) for sub-bands above the cut-off frequency. Such frequency-based coding of channels can reduce the encoding and decoding processing loads and/or the size of the encoded audio bitstream relative to parametric coding techniques that are applied to all input channels over the entire frequency range.
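The frequency-dependent channel selection in the 5.1 example can be sketched in a few lines. The channel labels, the function name `channels_for_subband`, and the 120 Hz cut-off used in the usage note are illustrative assumptions; the abstract specifies only that a cut-off frequency exists.

```python
CHANNELS = ["L", "R", "C", "Ls", "Rs", "LFE"]  # assumed labels for a 5.1 layout

def channels_for_subband(center_hz, cutoff_hz):
    """Return the channels that take part in parametric coding for a sub-band.

    At or below the cut-off all six channels are coded; above it the LFE
    channel is excluded.
    """
    if center_hz <= cutoff_hz:
        return list(CHANNELS)
    return [c for c in CHANNELS if c != "LFE"]
```

For instance, with a hypothetical 120 Hz cut-off, `channels_for_subband(80, 120)` returns all six channels, while `channels_for_subband(1000, 120)` returns the five regular channels, so no cue parameters need to be computed or transmitted for the LFE channel in the upper sub-bands.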
Abstract:
In an apparatus for analyzing an analysis time signal that has been generated by encoding and decoding an original time signal according to an encoding algorithm, the encoding block raster underlying the analysis time signal, as used by the encoding algorithm, is determined first. Thereupon, the analysis time signal is converted from its time representation to a spectral representation comprising a plurality of analysis spectral coefficients, using the established encoding block raster. Then, at least two analysis spectral coefficients, or at least two spectral coefficients derived from the analysis spectral coefficients by multiplication by an encoding amplification factor or by application of a compression function, are grouped. Next, the greatest common divisor of the analysis spectral coefficients, or of the spectral coefficients derived from them, is calculated; it corresponds to the quantization step width used for quantization in the encoding algorithm, or to an integer multiple of it. In the case of an audio signal, the scale factor for this group of spectral coefficients, i.e. for a scale factor band, can then easily be established from the quantization step width. Thus, all parameters used for the quantization of the original time signal are known, so that quantizing the analysis time signal no longer requires full iteration loops, which are, on the one hand, computationally very intensive and, on the other hand, introduce tandem encoding distortions.
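The greatest-common-divisor step can be sketched as follows. The abstract does not describe how a divisor is computed for non-integer coefficients; mapping the coefficients onto a fixed-point grid with a `resolution` parameter is an assumption of this sketch, as is the function name `estimate_step`, and the sketch ignores the noise a real decoded signal would carry.

```python
from functools import reduce
from math import gcd

def estimate_step(coefficients, resolution=1e-3):
    """Estimate the quantization step width of a group of spectral coefficients.

    Assumption: coefficients are exact multiples of the step up to `resolution`.
    Returns None if every coefficient rounds to zero.
    """
    ints = [round(abs(c) / resolution) for c in coefficients]
    ints = [i for i in ints if i != 0]  # zeros carry no step information
    if not ints:
        return None
    return reduce(gcd, ints) * resolution
```

As the abstract notes, the result is the step width or an integer multiple of it: coefficients that happen to be only even multiples of the true step would yield twice the step.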
Abstract:
An apparatus for providing one or more adjusted parameters for the provision of an upmix signal representation on the basis of a downmix signal representation and object-related parametric information includes a parameter adjuster. The parameter adjuster is configured to receive one or more input parameters and to provide, on the basis thereof, one or more adjusted parameters. The parameter adjuster is configured to provide the one or more adjusted parameters in dependence on the one or more input parameters and the object-related parametric information, such that a distortion of the upmix signal representation caused by the use of non-optimal parameters is reduced at least for input parameters deviating from optimal parameters by more than a predetermined deviation.