Abstract:
An apparatus for processing a multi-channel signal includes a means for determining a similarity between a first one of two channels and a second one of the two channels. Furthermore, a means for performing a prediction filtering of the spectral coefficients is provided, which is configured to perform the prediction filtering with only a single prediction filter for both channels in case of high similarity between the first and the second channel, and to perform the prediction filtering with two separate prediction filters in case of a dissimilarity between the first and the second channel. This avoids the introduction of stereo artifacts and a deterioration of the coding gain in stereo coding techniques.
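The switching decision described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how similarity is measured, so the normalized cross-correlation, the function names, and the threshold of 0.9 are all hypothetical choices.

```python
import math


def normalized_correlation(a, b):
    """Normalized cross-correlation of two equal-length coefficient lists.

    Assumption: this is one possible similarity measure; the abstract
    does not name a specific one.
    """
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def choose_filter_mode(left, right, threshold=0.9):
    """Return 'single' for one shared prediction filter, else 'separate'.

    The threshold value is hypothetical.
    """
    similarity = normalized_correlation(left, right)
    return "single" if similarity >= threshold else "separate"


# Usage: nearly identical channels share one filter, unrelated ones do not.
left = [1.0, -0.5, 0.25, 0.8]
right = [0.9, -0.45, 0.3, 0.7]
mode_similar = choose_filter_mode(left, right)
mode_different = choose_filter_mode([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
```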
Abstract:
For analyzing an information signal having a sequence of blocks of information units, wherein a plurality of consecutive blocks of the sequence of blocks represents an information entity, using a sequence of fingerprints for the sequence of blocks, identification results are provided for consecutive fingerprints, wherein an identification result represents an association of a block of information units with a predetermined information entity. Then at least two hypotheses are formed from the identification results for the consecutive fingerprints, wherein a first hypothesis is an assumption for the association of the sequence of blocks with a first information entity, and wherein a second hypothesis is an assumption for the association of the sequence of blocks with a second information entity. The hypotheses are then examined to obtain an examination result, on the basis of which a statement on the information signal is made. This achieves a meaningful and reliable time-continuous analysis of an information signal.
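One simple way to picture the hypothesis step is a vote over the per-block identification results. The sketch below is an assumption, not the examination procedure of the abstract: the function names, the entity labels, and the `min_share` acceptance rule are hypothetical.

```python
from collections import Counter


def form_hypotheses(identification_results, max_hypotheses=2):
    """Return the most frequent entity labels as (entity, support) pairs."""
    return Counter(identification_results).most_common(max_hypotheses)


def examine_hypotheses(hypotheses, total_blocks, min_share=0.5):
    """Accept the best-supported hypothesis if enough blocks agree with it.

    Assumption: a plain share-of-blocks rule; min_share is hypothetical.
    Returns None when no hypothesis is supported strongly enough.
    """
    if not hypotheses or total_blocks == 0:
        return None
    entity, support = hypotheses[0]
    return entity if support / total_blocks >= min_share else None


# Usage: one identification result per fingerprint of consecutive blocks.
results = ["song_a", "song_a", "song_b", "song_a", "song_b", "song_a"]
hypotheses = form_hypotheses(results)
statement = examine_hypotheses(hypotheses, len(results))
```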
Abstract:
An input audio signal having an input temporal envelope is converted into an output audio signal having an output temporal envelope. The input temporal envelope of the input audio signal is characterized. The input audio signal is processed to generate a processed audio signal, wherein the processing de-correlates the input audio signal. The processed audio signal is adjusted based on the characterized input temporal envelope to generate the output audio signal, wherein the output temporal envelope substantially matches the input temporal envelope.
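The characterize-then-adjust idea can be sketched with a block-wise RMS envelope. This is a minimal illustration under assumptions: the abstract does not specify how the envelope is characterized, so the RMS measure, the block size, and the function names are hypothetical, and the "processed" signal below is an arbitrary stand-in for a de-correlated signal of the same length as the input.

```python
import math


def block_envelope(signal, block_size):
    """Characterize the temporal envelope as the RMS of consecutive blocks."""
    envelope = []
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        envelope.append(math.sqrt(sum(x * x for x in block) / len(block)))
    return envelope


def match_envelope(processed, target_envelope, block_size, eps=1e-12):
    """Scale each block of `processed` so its RMS equals the target value."""
    current = block_envelope(processed, block_size)
    output = []
    for i, start in enumerate(range(0, len(processed), block_size)):
        gain = target_envelope[i] / max(current[i], eps)
        output.extend(x * gain for x in processed[start:start + block_size])
    return output


# Usage: the input has a loud first block and a quiet second block.
input_signal = [0.0, 1.0, 0.0, -1.0, 0.1, -0.1, 0.1, -0.1]
processed = [0.5, 0.5, -0.5, -0.5, 0.5, -0.5, 0.5, 0.5]  # stand-in signal
input_env = block_envelope(input_signal, 4)
output_signal = match_envelope(processed, input_env, 4)
output_env = block_envelope(output_signal, 4)
```

After the adjustment, the block-wise envelope of the output follows that of the input even though the waveform itself comes from the processed signal.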
Abstract:
At an audio encoder, cue codes are generated for one or more audio channels, wherein an envelope cue code is generated by characterizing a temporal envelope in an audio channel. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels, where C>E≧1. Received cue codes include an envelope cue code corresponding to a characterized temporal envelope of an audio channel corresponding to the transmitted channel(s). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein the envelope cue code is applied to an upmixed channel or a synthesized signal to adjust a temporal envelope of the synthesized signal based on the characterized temporal envelope such that the adjusted temporal envelope substantially matches the characterized temporal envelope.
Abstract:
In a method for concealing an error in an encoded audio signal, a set of spectral coefficients is subdivided into at least two sub-bands (14), whereupon the sub-bands are subjected to a reverse transform (16). A specific prediction is performed (18) for each quasi time signal of a sub-band to obtain an estimated temporal representation for a sub-band of a set of spectral coefficients following the current set. A forward transform (20) of the time signal of each sub-band provides estimated spectral coefficients which can be used (28) instead of erroneous spectral coefficients of a following set of spectral coefficients, e.g. in order to conceal transmission errors. Transforming at the sub-band level provides independence from transform characteristics such as block length, window type and MDCT algorithm while at the same time preserving spectral processing for error concealment. Thus the spectral characteristics of audio signals can also be taken into account during error concealment.
Abstract:
For embedding watermark information into an information signal including audio and/or video information, a synchronization sequence with a plurality of synchronization sequence units and a data sequence with a plurality of data sequence units are first provided, wherein a time shift is present between the data sequence and the synchronization sequence and wherein the degree of shifting depends on the watermark information. A combination means generates a combination sequence having a plurality of combination sequence units from the synchronization sequence and the data sequence shifted with regard to the synchronization sequence, wherein the combination sequence units are derived from synchronization sequence units and shifted data sequence units. The combination sequence is combined with the information signal in order to embed the watermark information into the information signal. A watermark extractor obtains, for every data sequence correlation peak, an associated synchronization sequence correlation peak, and can therefore determine the watermark information securely and robustly from the time interval between the synchronization sequence correlation peak and the data sequence correlation peak. The concept is robust, provides a high data rate, and is flexible with regard to the weighting of synchronization energy against data energy and with regard to the trade-off between robustness and data rate.
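The shift-based embedding and the peak-based extraction can be illustrated with a toy example. Everything specific here is an assumption: the pseudo-random +/-1 sequences, their length of 511, the circular shift, and the function names are hypothetical, and the host audio or video signal is left out so that only the combination sequence is correlated.

```python
import random


def make_sequence(length, seed):
    """Pseudo-random +/-1 sequence (hypothetical stand-in for a real sequence)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def circular_shift(seq, shift):
    """Delay `seq` by `shift` units, wrapping around at the end."""
    shift %= len(seq)
    return seq[-shift:] + seq[:-shift] if shift else list(seq)


def embed(sync, data, shift):
    """Combination sequence: sync units plus data units delayed by `shift`."""
    return [s + d for s, d in zip(sync, circular_shift(data, shift))]


def correlation_peak(received, reference):
    """Circular lag at which `reference` correlates best with `received`."""
    n = len(reference)
    best_lag, best_value = 0, float("-inf")
    for lag in range(n):
        value = sum(received[(i + lag) % n] * reference[i] for i in range(n))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


def extract(received, sync, data):
    """Watermark value = interval between the data peak and the sync peak."""
    n = len(sync)
    return (correlation_peak(received, data) - correlation_peak(received, sync)) % n


# Usage: embed the value 37, then delay the whole signal by an unknown offset.
sync = make_sequence(511, seed=1)
data = make_sequence(511, seed=2)
received = circular_shift(embed(sync, data, 37), 100)
recovered = extract(received, sync, data)
```

Because the information sits in the interval between the two peaks, the unknown overall delay of the received sequence cancels out in this sketch.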
Abstract:
A selected channel of a multi-channel signal which is represented by frames composed of sampling values having a high time resolution can be encoded with higher quality when a wave form parameter representation representing a wave form of an intermediate resolution representation of the selected channel is derived, the wave form parameter representation including a sequence of intermediate wave form parameters having a time resolution lower than the high time resolution of the sampling values and higher than a time resolution defined by a frame repetition rate. The wave form parameter representation with the intermediate resolution can be used to shape a reconstructed channel to retrieve a channel having a signal envelope close to that of the original selected channel. The time scale on which the shaping is performed is shorter than the time scale of a framewise processing, thus enhancing the quality of the reconstructed channel. On the other hand, the shaping time scale is longer than the time scale of the sampling values, significantly reducing the amount of data needed by the wave form parameter representation.
Abstract:
An apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation and a parametric side information associated with the downmix signal representation has a parameter adjuster. The parameter adjuster is configured to receive one or more parameters and to provide, on the basis thereof, one or more adjusted parameters. The parameter adjuster is configured to provide the one or more adjusted parameters in dependence on an average value of a plurality of parameter values, such that a distortion of the upmix signal representation caused by the use of non-optimal parameters is reduced at least for parameters deviating from optimal parameters by more than a predetermined deviation.
Abstract:
An audio format transcoder transcodes an input audio signal having at least two directional audio components. The audio format transcoder includes a converter for converting the input audio signal into a converted signal, the converted signal having a converted signal representation and a converted signal direction of arrival. The audio format transcoder further includes a position provider for providing at least two spatial positions of at least two spatial audio sources and a processor for processing the converted signal representation based on the at least two spatial positions to obtain at least two separated audio source measures.
Abstract:
An apparatus for providing an upmix signal representation on the basis of a downmix signal representation and an object-related parametric information, which are included in a bitstream representation of an audio content, and in dependence on a user-specified rendering matrix, has a distortion limiter configured to obtain a modified rendering matrix using a linear combination of the user-specified rendering matrix and a target rendering matrix in dependence on a linear combination parameter. The apparatus also has a signal processor configured to obtain the upmix signal representation on the basis of the downmix signal representation and the object-related parametric information using the modified rendering matrix. The apparatus is also configured to evaluate a bitstream element representing the linear combination parameter in order to obtain the linear combination parameter.
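The linear combination performed by the distortion limiter might look like the sketch below. The abstract only states that a linear combination parameter controls the blend; the convex form (1 - g) * user + g * target, the range check, and the names are assumptions made for illustration.

```python
def modified_rendering_matrix(user, target, g):
    """Blend two equally sized matrices as (1 - g) * user + g * target.

    Assumption: a convex combination with g in [0, 1]; the abstract does
    not give the exact form of the linear combination.
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must lie in [0, 1]")
    return [
        [(1.0 - g) * u + g * t for u, t in zip(user_row, target_row)]
        for user_row, target_row in zip(user, target)
    ]


# Usage: g = 0 keeps the user's matrix, g = 1 yields the target matrix.
user_matrix = [[1.0, 0.0], [0.0, 1.0]]
target_matrix = [[0.5, 0.5], [0.5, 0.5]]
blended = modified_rendering_matrix(user_matrix, target_matrix, 0.5)
```

In this sketch a larger g pulls the rendering further toward the target matrix, which limits how far an extreme user setting can move the result.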