Abstract:
A parameter that measures a characteristic of a channel or of a pair of channels with respect to another channel of a multi-channel signal can be quantized more efficiently using a quantization rule that is generated based on the relation between an energy measure of the channel or pair of channels and an energy measure of the multi-channel signal. Because the generation of the quantization rule takes a psychoacoustic approach into account, the size of an encoded representation of the multi-channel signal can be decreased by coarser quantization without significantly degrading the perceptual quality of the multi-channel signal when it is reconstructed from the encoded representation.
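The energy-dependent choice of quantization rule can be illustrated with a minimal sketch. It assumes a simple two-level rule: a fine step when the channel pair carries a significant share of the total energy, and a coarse step otherwise. The function names, the step sizes and the -20 dB threshold are illustrative assumptions and are not taken from the abstract.

```python
import math

def choose_step_size(pair_energy, total_energy,
                     fine_step=1.0, coarse_step=4.0, threshold_db=-20.0):
    """Pick a quantizer step from the energy ratio (all values are assumptions)."""
    if pair_energy <= 0.0 or total_energy <= 0.0:
        # No usable energy: the parameter is perceptually irrelevant here.
        return coarse_step
    ratio_db = 10.0 * math.log10(pair_energy / total_energy)
    # Low relative energy -> coarser quantization is assumed to be acceptable.
    return fine_step if ratio_db >= threshold_db else coarse_step

def quantize(value, step):
    """Uniform quantization of a parameter value to an integer index."""
    return round(value / step)

def dequantize(index, step):
    """Reconstruct the parameter value from its index."""
    return index * step

# A level-difference parameter of 5.3 (hypothetical value) in two energy situations.
loud_step = choose_step_size(pair_energy=1.0, total_energy=2.0)     # about -3 dB
quiet_step = choose_step_size(pair_energy=0.001, total_energy=1.0)  # -30 dB
print(dequantize(quantize(5.3, loud_step), loud_step))
print(dequantize(quantize(5.3, quiet_step), quiet_step))
```

With a coarser step, fewer distinct indices occur, so the encoded parameter needs fewer bits where the channel pair contributes little to the overall signal.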
Abstract:
In order to analyse an information signal, significant short-time spectra are extracted from the information signal. The extraction device (16) is configured to extract those short-time spectra that come closer to a specific characteristic than other short-time spectra of the information signal. The extracted short-time spectra are then decomposed (18) into component signals by independent component analysis (ICA), where a component signal spectrum represents a profile spectrum of a sound source that generates a sound corresponding to the required characteristic. An amplitude envelope is calculated (20) for each profile spectrum from a series of short-time spectra of the information signal and from the determined profile spectra, said envelope indicating how the profile spectrum of a sound source generally varies over time. The profile spectra and associated amplitude envelopes provide a description of the information signal that can be further evaluated, e.g. for transcription in the case of a music signal.
Abstract:
An inventive method for introducing information into a data stream that includes data about spectral values representing a short-term spectrum of an audio signal first processes the data stream to obtain the spectral values of the short-term spectrum of the audio signal. In addition, the information to be introduced is combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information signal is generated and then weighted with an established psychoacoustically maskable noise energy to generate a weighted information signal, wherein the energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal are then summed and afterwards processed again to obtain a processed data stream including both the audio information and the information to be introduced. Because the information is introduced into the data stream without changing to the time domain, the block raster underlying the short-term spectrum is not touched, so that introducing a watermark does not lead to tandem coding effects.
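The spread, weight and sum steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the abstract: the chips of the spread signal are mapped directly onto spectral lines instead of being transformed into a spectral representation, the maskable noise energy per line is taken as given, and the names `spread` and `embed` are hypothetical.

```python
import math

def spread(bits, sequence):
    """Spread each information bit (0 or 1) over the +1/-1 chips of the sequence."""
    chips = []
    for bit in bits:
        sign = 1 if bit else -1
        chips.extend(sign * chip for chip in sequence)
    return chips

def embed(spectral_values, chips, maskable_energy):
    """Weight each chip by the square root of the maskable noise energy
    (assumed to be given per spectral line) and add it to the spectral value."""
    if not (len(spectral_values) == len(chips) == len(maskable_energy)):
        raise ValueError("all inputs must have one entry per spectral line")
    return [x + c * math.sqrt(e)
            for x, c, e in zip(spectral_values, chips, maskable_energy)]

# Two bits spread with a length-2 sequence onto four spectral lines (toy values).
chips = spread([1, 0], [1, -1])
marked = embed([10.0, 10.0, 10.0, 10.0], chips, [4.0, 4.0, 1.0, 1.0])
print(chips)
print(marked)
```

Because the weighting uses the maskable energy of each line, the added component has at most that energy on each line, which is the property the abstract relies on for inaudibility.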
Abstract:
A method for error concealment in an encoded audio signal, whereby a set of spectral coefficients is divided into at least two sub-bands (14), whereupon said sub-bands undergo reverse transformation (16). A specific prediction is performed (18) for each quasi-time signal of a sub-band in order to obtain an estimated temporal representation for a sub-band of a set of spectral coefficients following on from a real set. Forward transformation (20) of the time signal of each sub-band provides estimated spectral coefficients that can be used (28) instead of defective spectral coefficients of a subsequent set of spectral coefficients in order to conceal transmission errors, for example. Sub-band transformation provides independence from transformation characteristics such as frame length, window type or MDCT algorithm, while at the same time ensuring that spectral processing is maintained for error concealment, whereby the spectral characteristics of audio signals can also be taken into account during said error concealment.
Abstract:
The invention makes it possible to combine a scalable audio coder with TNS technology. According to the inventive method for encoding first time signals (x1) sampled at a first sampling rate, second time signals (x2) with a sampling rate lower than the first sampling rate are generated (12). The second time signals (x2) are then encoded (14) according to a first coding algorithm and written into a bit stream (xAUS) (16). The encoded second time signals (x2c) are then decoded (14) again and transformed (23, 24) into the frequency domain, as are the first time signals. TNS prediction coefficients are then calculated (25) from a spectral representation (X1) of the first time signals. The transformed output signal (X2cd) of the coder/decoder (14) with the first coding algorithm and the spectral representation (X1) of the first time signals are subjected to a prediction over frequency (27) in order to obtain spectral residual values for both signals, using the prediction coefficients calculated on the basis of the first time signals alone. These two signals are evaluated against each other (26, 28). The evaluated spectral residual values (Xb) are then encoded by means of a second coding algorithm in order to obtain coded evaluated spectral residual values (Xcb). These coded evaluated spectral residual values are written into the bit stream (xAUS) in addition to side information with the prediction coefficients.
Abstract:
The invention relates to a method for signalling a noise substitution during audio signal coding. According to said method, the audio signal is first transformed into the frequency domain to obtain spectral values. The spectral values are subsequently grouped to form groups of spectral values. On the basis of a detection of whether a group of spectral values is a noise group or not, a coding table is allocated to a non-noise group or a tonal group by means of a coding table number for redundancy coding of the same. If a group is a noise group, it is allocated an additional coding table number which does not refer to a coding table, in order to signal that this group is a noise group and that it must not be redundancy coded. Signalling noise substitution by means of a Huffman code table number for noise groups of spectral values, which are for instance scale factor band sections and which must not be redundancy coded, makes it possible to implement noise substitution in a scale factor band within the bit stream syntax of the MPEG-2 Advanced Audio Coding (AAC) standard, without intervening in the basic coding structure and without having to touch the structure of the existing bit stream syntax.
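The signalling idea can be sketched in a few lines. The example assumes that noise detection has already produced one flag per group and that tonal groups choose between two toy coding tables. The reserved number 13, the table-selection rule and all function names are illustrative assumptions and are not taken from the abstract.

```python
# Hypothetical reserved number: it refers to no actual coding table and
# only marks a group as noise-substituted.
NOISE_TABLE_NUMBER = 13

def pick_table(group):
    """Toy table selection for non-noise groups (assumption for illustration)."""
    return 1 if max(abs(v) for v in group) <= 1 else 2

def assign_table_numbers(groups, noise_flags):
    """Return one coding table number per group of spectral values."""
    numbers = []
    for group, is_noise in zip(groups, noise_flags):
        numbers.append(NOISE_TABLE_NUMBER if is_noise else pick_table(group))
    return numbers

def groups_to_redundancy_code(table_numbers):
    """Indices of groups that are redundancy coded; noise groups are skipped."""
    return [i for i, n in enumerate(table_numbers) if n != NOISE_TABLE_NUMBER]

groups = [[1, 0, -1], [3, -2, 5], [2, 1, 0]]
numbers = assign_table_numbers(groups, noise_flags=[False, True, False])
print(numbers)
print(groups_to_redundancy_code(numbers))
```

The design point is that the existing per-group table-number field carries the noise flag, so no new syntax element is needed.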
Abstract:
The invention relates to a method for coding or decoding an audio signal that combines the advantages of TNS processing and noise substitution. A time-discrete audio signal is initially transformed into the frequency domain in order to obtain spectral values of the temporal audio signal. A prediction of the spectral values over frequency is subsequently performed in order to obtain spectral residual values. Areas within the spectral values that encompass spectral values with noise properties are detected. The spectral residual values are noise substituted in the noise areas, whereupon data relating to the noise areas and the noise substitution are incorporated into side information pertaining to a coded audio signal.
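The combination of prediction over frequency and noise substitution can be sketched as below. The sketch assumes a first-order open-loop predictor with a given coefficient, noise areas supplied as (start, stop) index ranges, and the residual energy per area as the substitution data. These choices and the function names are assumptions made for illustration, not details from the abstract.

```python
def predict_over_frequency(spectrum, coeff):
    """First-order open-loop prediction along the frequency axis.
    The first value has no predecessor and is passed through."""
    residual = [spectrum[0]]
    for k in range(1, len(spectrum)):
        residual.append(spectrum[k] - coeff * spectrum[k - 1])
    return residual

def substitute_noise(residual, noise_areas):
    """Zero the residual values inside each noise area and record the
    area bounds plus its residual energy as side information."""
    coded = list(residual)
    side_info = []
    for start, stop in noise_areas:
        energy = sum(v * v for v in residual[start:stop])
        side_info.append((start, stop, energy))
        for k in range(start, stop):
            coded[k] = 0.0
    return coded, side_info

residual = predict_over_frequency([1.0, 2.0, 3.0, 3.5], coeff=0.5)
coded, side_info = substitute_noise(residual, noise_areas=[(2, 4)])
print(residual)
print(coded, side_info)
```

A decoder following this sketch would generate random values with the signalled energy in each noise area and then run the inverse prediction, so the substituted noise is shaped by the same prediction filter as the rest of the spectrum.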
Abstract:
Audio decoder for decoding encoded audio data, comprising: an input interface (1100) for receiving the encoded audio data, the encoded audio data comprising a plurality of encoded channels or a plurality of encoded objects or compressed metadata related to the plurality of objects; a core decoder (1300) for decoding the plurality of encoded channels and the plurality of encoded objects; a metadata decompressor (1400) for decompressing the compressed metadata; an object processor (1200) for processing the plurality of decoded objects using the decompressed metadata to obtain a number of output channels (1205) comprising audio data from the objects and the decoded channels; and a post-processor (1700) for converting the number of output channels (1205) into an output format, wherein the audio decoder is configured to bypass the object processor and to feed a plurality of decoded channels into the post-processor (1700) when the encoded audio data does not contain any audio objects, and to feed the plurality of decoded objects and the plurality of decoded channels into the object processor (1200) when the encoded audio data comprises encoded channels and encoded objects.
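The bypass logic described in the last clause can be sketched as a small routing function. The object processor and post-processor are passed in as callables. Their signatures and the name `route_decoded_audio` are hypothetical and only serve to show the control flow.

```python
def route_decoded_audio(channels, objects, metadata,
                        object_processor, post_processor):
    """Route decoded audio through the decoder stages.

    channels / objects: decoded channels and decoded objects (any list-like).
    metadata: decompressed object metadata, unused when there are no objects.
    """
    if not objects:
        # No audio objects: bypass the object processor entirely.
        return post_processor(channels)
    # Channels and objects present: the object processor combines them
    # into output channels, which the post-processor then converts.
    output_channels = object_processor(channels, objects, metadata)
    return post_processor(output_channels)

# Stub stages for illustration only.
def demo_object_processor(channels, objects, metadata):
    return channels + objects

def demo_post_processor(output_channels):
    return list(output_channels)

print(route_decoded_audio(["L", "R"], [], None,
                          demo_object_processor, demo_post_processor))
print(route_decoded_audio(["L", "R"], ["obj1"], {"gain": 1.0},
                          demo_object_processor, demo_post_processor))
```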