Abstract:
Representations of spatial audio scenes using Higher-Order Ambisonics (HOA) technology typically require a large number of coefficients per time instant. This data rate is too high for most practical applications that require real-time transmission of audio signals. According to the invention, the compression is carried out in the spatial domain instead of the HOA domain. The (N+1)² input HOA coefficients are transformed into (N+1)² equivalent signals in the spatial domain, and the resulting (N+1)² time-domain signals are input to a bank of parallel perceptual codecs. At the decoder side, the individual spatial-domain signals are decoded, and the spatial-domain coefficients are transformed back into the HOA domain in order to recover the original HOA representation.
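The round trip between the HOA domain and the spatial domain can be illustrated with a minimal sketch for order N = 1, where (N+1)² = 4. Everything specific here is an assumption made for illustration and is not taken from the abstract: the channel order (W, X, Y, Z), the N3D-style scaling, the four tetrahedral sampling directions, and the names `MODE_MATRIX`, `hoa_to_spatial` and `spatial_to_hoa`. The perceptual codecs that would sit between the two transforms are omitted.

```python
# Order N = 1, so there are (N + 1)**2 = 4 HOA coefficients per time instant.
# Assumption: first-order harmonics [1, sqrt(3)x, sqrt(3)y, sqrt(3)z] sampled at
# the tetrahedral directions (1,1,1), (1,-1,-1), (-1,1,-1), (-1,-1,1) / sqrt(3).
# Entry [n][j] is harmonic n evaluated at direction j.
MODE_MATRIX = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [1, -1, -1, 1],
]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def hoa_to_spatial(hoa):
    """Encoder side: HOA coefficients -> spatial-domain signals.

    The rows of MODE_MATRIX are mutually orthogonal with squared norm 4 and the
    matrix is symmetric, so its inverse is simply MODE_MATRIX / 4.
    """
    return [value / 4 for value in matvec(MODE_MATRIX, hoa)]


def spatial_to_hoa(spatial):
    """Decoder side: spatial-domain signals -> HOA coefficients."""
    return matvec(MODE_MATRIX, spatial)


hoa_frame = [0.5, -0.25, 0.125, 1.0]        # one time instant, 4 coefficients
spatial_frame = hoa_to_spatial(hoa_frame)   # 4 signals, one per codec
recovered = spatial_to_hoa(spatial_frame)   # back in the HOA domain
```

Because the transform is invertible, any loss in a real system would come from the perceptual codecs, not from the change of domain.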
Abstract:
The invention relates to a data structure for Higher Order Ambisonics (HOA) audio data, which includes 2D or 3D spatial audio content data for one or more different HOA audio data stream descriptions. The HOA audio data can have an order greater than 3, and the data structure can additionally include single audio signal source data and/or microphone array audio data from fixed or time-varying spatial positions.
Abstract:
In lossy-based lossless coding, a PCM audio signal passes through a lossy encoder to a lossy decoder. The lossy encoder provides a lossy bit stream. The difference signal between the PCM signal and the lossy decoder output is losslessly encoded, providing an extension bit stream. The invention enhances lossy perceptual audio encoding/decoding with an extension that enables mathematically exact reproduction of the original waveform using enhanced de-correlation, and provides additional data for reconstructing an intermediate-quality audio signal at the decoder side. The lossless extension can be used to extend the widely used mp3 encoding/decoding to lossless encoding/decoding and to superior-quality mp3 encoding/decoding.
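The base-plus-extension principle can be sketched as follows. A coarse uniform quantiser stands in for the lossy perceptual codec, and the difference signal is kept as a plain list instead of being entropy coded. The step size `STEP` and all function names are hypothetical and chosen only for illustration.

```python
STEP = 64  # hypothetical quantiser step; a stand-in for a perceptual codec


def lossy_encode(pcm):
    """Stand-in lossy encoder: round each integer sample to the nearest step."""
    return [(sample + STEP // 2) // STEP for sample in pcm]


def lossy_decode(indices):
    """Stand-in lossy decoder: map the quantiser indices back to samples."""
    return [index * STEP for index in indices]


def encode(pcm):
    """Return the lossy bit stream and the difference (extension) signal."""
    lossy_stream = lossy_encode(pcm)
    decoded = lossy_decode(lossy_stream)
    difference = [s - d for s, d in zip(pcm, decoded)]
    return lossy_stream, difference


def decode(lossy_stream, difference=None):
    """Decode the base layer only, or add the extension for exact samples."""
    decoded = lossy_decode(lossy_stream)
    if difference is None:
        return decoded
    return [d + r for d, r in zip(decoded, difference)]


pcm = [0, 1000, -1234, 32767, -32768]
lossy_stream, difference = encode(pcm)
base_only = decode(lossy_stream)
exact = decode(lossy_stream, difference)
```

A decoder that ignores the extension still produces the lossy signal, while a decoder that reads both streams recovers the input sample for sample.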
Abstract:
In lossy-based lossless coding, a PCM audio signal passes through a lossy encoder to a lossy decoder. The lossy encoder provides a lossy bit stream. The lossy decoder also provides side information that is used to control the coefficients of a prediction filter that de-correlates the difference signal between the PCM signal and the lossy decoder output. The de-correlated difference signal is losslessly encoded, providing an extension bit stream. Instead of, or in addition to, de-correlating in the time domain, a de-correlation in the frequency domain using spectral whitening can be performed. The lossy encoded bit stream and the losslessly encoded extension bit stream together form a losslessly encoded bit stream. The invention enhances lossy perceptual audio encoding/decoding with an extension that enables mathematically exact reproduction of the original waveform, and provides additional data for reconstructing an intermediate-quality audio signal at the decoder side. The lossless extension can be used to extend the widely used mp3 encoding/decoding to lossless encoding/decoding and to superior-quality mp3 encoding/decoding.
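Time-domain de-correlation of the difference signal can be sketched with a first-order integer predictor. This is only an illustration under stated assumptions: the abstract does not give the filter order or say how the side information maps to coefficients, so the single coefficient `coeff_q8` (a fixed-point value scaled by 256) and the function names are hypothetical. The point of the sketch is that encoder and decoder apply the same integer rounding, which keeps the scheme exactly invertible.

```python
def predict(previous, coeff_q8):
    """Integer prediction from the previous sample.

    coeff_q8 is a hypothetical fixed-point coefficient (value * 256). The
    right shift floors the product identically in encoder and decoder.
    """
    return (coeff_q8 * previous) >> 8


def decorrelate(difference, coeff_q8):
    """Encoder side: replace each sample by its prediction error."""
    errors, previous = [], 0
    for sample in difference:
        errors.append(sample - predict(previous, coeff_q8))
        previous = sample
    return errors


def recorrelate(errors, coeff_q8):
    """Decoder side: rebuild the difference signal from prediction errors."""
    samples, previous = [], 0
    for error in errors:
        sample = error + predict(previous, coeff_q8)
        samples.append(sample)
        previous = sample
    return samples


difference = [10, 12, 11, 9, -3, -5]
coeff_q8 = 230  # about 0.9; in the abstract this would follow from side information
errors = decorrelate(difference, coeff_q8)
restored = recorrelate(errors, coeff_q8)
```

Since the decoder derives the same coefficient from the same side information, the coefficient itself would not need to be transmitted in such a scheme.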
Abstract:
Lossless compression algorithms can only exploit redundancies of the original audio signal to reduce the data rate, but not irrelevancies as identified by psycho-acoustics. Lossless audio coding schemes apply a filter or transform for de-correlation and then encode the transformed signal. The encoded bit stream comprises the parameters of the transform or filter, and the lossless representation of the transformed signal. However, in the case of lossy-based lossless coding, the amount of additional information is several times the amount of base layer data. Therefore the additional data cannot be packed completely into the base layer data stream, e.g. as ancillary data. The combination of a lossy coding format with a lossless coding extension results in at least two data streams: the base layer containing the lossy coding information, and the enhancement data stream for rebuilding the mathematically lossless original input signal. Furthermore, several intermediate quality layers are possible. However, these data streams are not independent of each other. Every higher layer depends on the lower layers and can only be reasonably decoded in combination with these lower layers. According to the invention, a special combination of one-time header information with repeated header information in a block structure is used; the kind of combination depends on the type of application. Assignment information data identify the different parts or layers of the lossless format belonging to one input signal. Synchronisation data are used to combine the different data streams, parts or layers into a single lossless or intermediate output signal. These features are used in a file format and in a streaming format.
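The roles of the assignment and synchronisation data can be sketched with a hypothetical block record. The field names (`signal_id`, `layer`, `frame_index`) and the function `group_for_decoding` are assumptions for illustration and do not reflect the actual file or streaming format, whose header layout the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    signal_id: int    # assignment information: which input signal this belongs to
    layer: int        # 0 = lossy base layer, 1 and above = enhancement layers
    frame_index: int  # synchronisation data: which frame this block belongs to
    payload: bytes


def group_for_decoding(blocks, signal_id):
    """Collect, per frame, the layers that can actually be decoded.

    A higher layer is usable only if every lower layer of the same frame is
    present, so collection stops at the first missing layer.
    """
    frames = {}
    for block in blocks:
        if block.signal_id == signal_id:
            frames.setdefault(block.frame_index, {})[block.layer] = block.payload

    result = []
    for frame_index in sorted(frames):
        layers = frames[frame_index]
        usable, layer = [], 0
        while layer in layers:
            usable.append(layers[layer])
            layer += 1
        result.append((frame_index, usable))
    return result


blocks = [
    Block(7, 0, 0, b"b0"),
    Block(7, 1, 0, b"e0"),
    Block(7, 0, 1, b"b1"),
    Block(7, 2, 1, b"x1"),     # layer 1 of frame 1 is missing, so this is unusable
    Block(9, 0, 0, b"other"),  # belongs to a different input signal
]
grouped = group_for_decoding(blocks, signal_id=7)
```

In this sketch, frame 0 can be decoded losslessly from both layers, while frame 1 falls back to the base layer because its intermediate layer is absent.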
Abstract:
Lossless audio coding performs de-correlation and encodes the transformed signal. The encoded bit stream comprises de-correlation parameters and the lossless representation data of the transformed signal. However, in the case of lossy-based lossless coding, the amount of additional information exceeds the amount of base layer data. Therefore the additional data cannot be packed completely into the base layer, e.g. as ancillary data. The combination of a lossy coding format with a lossless coding extension results in two data streams: the base layer containing the lossy coding information, and the enhancement data stream for rebuilding the mathematically lossless original input signal. Every higher layer depends on the lower layers and can only be reasonably decoded in combination with these lower layers. According to the invention, a special combination of one-time header information with repeated header information in a block structure is used. Assignment information data identify the different layers.