Abstract:
The present document relates to the technical field of audio coding, decoding and processing. It specifically relates to methods of recovering high frequency content of an audio signal from low frequency content of the same audio signal in an efficient manner. A method for determining a first banded tonality value for a first frequency subband of an audio signal is described. The first banded tonality value is used for approximating a high frequency component of the audio signal based on a low frequency component of the audio signal. The method comprises determining a set of transform coefficients in a corresponding set of frequency bins based on a block of samples of the audio signal; determining a set of bin tonality values for the set of frequency bins using the set of transform coefficients, respectively; and combining a first subset of two or more of the set of bin tonality values for two or more corresponding adjacent frequency bins of the set of frequency bins lying within the first frequency subband, thereby yielding the first banded tonality value for the first frequency subband.
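The three steps named above (transform coefficients, per-bin tonality values, combination over adjacent bins of a subband) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how a bin tonality value is computed, so the peak-to-neighbour contrast used here, the naive DFT, and all function names are hypothetical stand-ins.

```python
import cmath
import math


def dft(block):
    """Naive DFT of a real block; returns coefficients for bins 0..N/2."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(block))
        for k in range(n // 2 + 1)
    ]


def bin_tonality(coeffs):
    """Hypothetical per-bin tonality in [0, 1]: how far the power of a bin
    stands out from the mean power of its direct neighbours (assumption,
    not the measure used in the source)."""
    power = [abs(c) ** 2 for c in coeffs]
    values = []
    for k, p in enumerate(power):
        neighbours = [power[i] for i in (k - 1, k + 1) if 0 <= i < len(power)]
        mean_nb = sum(neighbours) / len(neighbours) if neighbours else 0.0
        denom = p + mean_nb
        values.append(max(0.0, (p - mean_nb) / denom) if denom > 0 else 0.0)
    return values


def banded_tonality(bin_values, first_bin, last_bin):
    """Combine the tonality values of adjacent bins first_bin..last_bin
    (inclusive) into one banded value; plain averaging is an assumption."""
    subset = bin_values[first_bin:last_bin + 1]
    if len(subset) < 2:
        raise ValueError("a subband must cover two or more adjacent bins")
    return sum(subset) / len(subset)
```

With a 64-sample block, a sinusoid centred on bin 8 gives a clearly higher banded value for bins 7 to 9 than a single impulse, whose spectrum is flat.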
Abstract:
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
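As an illustration of the absolute versus differential signalling described above, a decoder-side sketch might resolve the metadata into a concrete profile as shown below. The profile fields, the metadata layout, and the "reference" profile are all hypothetical; the abstract does not define them.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DrcProfile:
    # Hypothetical profile parameters (assumption, not from the source).
    threshold_db: float
    ratio: float
    max_boost_db: float


# Hypothetical profile assumed to be known to both encoder and decoder.
KNOWN_PROFILES = {"reference": DrcProfile(-31.0, 2.0, 6.0)}


def resolve_profile(metadata, known_profiles=KNOWN_PROFILES):
    """Turn compression-profile metadata into a concrete DrcProfile.

    'absolute' metadata carries every parameter directly; 'differential'
    metadata carries offsets relative to a named known profile.
    """
    values = metadata["values"]
    if metadata["mode"] == "absolute":
        return DrcProfile(**values)
    if metadata["mode"] == "differential":
        base = known_profiles[metadata["base"]]
        return DrcProfile(**{
            f.name: getattr(base, f.name) + values.get(f.name, 0.0)
            for f in fields(DrcProfile)
        })
    raise ValueError(f"unknown metadata mode: {metadata['mode']!r}")
```

Differential signalling only needs to carry the parameters that differ from the known profile, which is presumably why it is offered as an alternative to absolute values.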
Abstract:
The present document relates to methods and apparatus for audio coding. In particular, the present document relates to methods and apparatus for enhanced block switching and/or bit allocation in audio coding of transient-tonal signals. A method of encoding samples of an audio signal comprises determining a first measure indicative of transient characteristics of the audio signal, determining a second measure indicative of tonal characteristics of the audio signal, selecting a transform length for the audio signal on the basis of the first measure and the second measure, and applying a time-frequency transform to a block of samples of the audio signal in accordance with the selected transform length, to thereby obtain a block of frequency coefficients corresponding to the block of samples of the audio signal. Another method of encoding samples of an audio signal comprises applying a time-frequency transform to the audio signal in accordance with a selected transform length, to thereby obtain a sequence of blocks of frequency coefficients, wherein each block of frequency coefficients among said sequence corresponds to a respective block of samples of the audio signal, determining a measure of tonal characteristics for a frequency band of the audio signal based on the blocks of frequency coefficients among said sequence, selecting, for the blocks of frequency coefficients among said sequence, a quantization step size for the frequency coefficients in said frequency band on the basis of said measure of tonal characteristics, and quantizing, for the blocks of frequency coefficients among said sequence, the frequency coefficients in said frequency band in accordance with the selected quantization step size.
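The two decisions described above can be sketched as simple functions of the measures. The abstract only states that the choices depend on the transient and tonal measures; the thresholds, the block lengths, the normalisation of both measures to [0, 1], and the direction of each decision (tonal transients keep the long transform, tonal bands get a finer step) are assumptions made for illustration.

```python
LONG_BLOCK = 2048   # hypothetical long transform length
SHORT_BLOCK = 256   # hypothetical short transform length


def select_transform_length(transient_measure, tonality_measure,
                            transient_threshold=0.5, tonality_threshold=0.7):
    """Pick a transform length from both measures (each assumed in [0, 1]).

    Assumption: a transient alone triggers the short transform, but a
    transient that is also strongly tonal keeps the long transform to
    preserve frequency resolution.
    """
    is_transient = transient_measure > transient_threshold
    is_tonal = tonality_measure >= tonality_threshold
    return SHORT_BLOCK if is_transient and not is_tonal else LONG_BLOCK


def select_step_size(base_step, tonality_measure, max_reduction=0.5):
    """Pick a quantization step size for one frequency band.

    Assumption: the step shrinks linearly with tonality, down to
    (1 - max_reduction) * base_step for a fully tonal band.
    """
    t = min(max(tonality_measure, 0.0), 1.0)
    return base_step * (1.0 - max_reduction * t)
```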
Abstract:
The present document describes a method (800) for allocating bits to a frame (301) of a sequence of frames (301) to yield a bitstream having a constant average bitrate, wherein the frame (301) comprises audio data and metadata. The method (800) comprises maintaining (801) an overall bit reservoir (100) and maintaining (802) a virtual bit reservoir (510) being a subset of the overall bit reservoir (100), such that bits for the metadata of the frame (301) are allocated from the virtual bit reservoir (510) and such that bits for the audio data of the frame (301) are allocated from the overall bit reservoir (100).
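A minimal sketch of the two-reservoir bookkeeping follows, assuming a per-frame refill and simple capping. The class name, the refill quotas, and the clamping rule are hypothetical; the abstract only fixes that metadata bits come from the virtual reservoir, audio bits come from the overall reservoir, and the virtual reservoir is a subset of the overall one.

```python
class BitReservoir:
    """Overall bit reservoir with a virtual sub-reservoir for metadata."""

    def __init__(self, max_bits, virtual_max_bits,
                 bits_per_frame, metadata_bits_per_frame):
        self.max_bits = max_bits
        self.virtual_max_bits = virtual_max_bits
        self.bits_per_frame = bits_per_frame                    # average-rate budget
        self.metadata_bits_per_frame = metadata_bits_per_frame  # assumed quota
        self.level = 0          # bits currently in the overall reservoir
        self.virtual_level = 0  # bits currently earmarked for metadata

    def allocate_frame(self, audio_demand, metadata_demand):
        """Return (audio_bits, metadata_bits) granted to one frame."""
        # Refill both reservoirs with this frame's share of the average rate.
        self.level = min(self.max_bits, self.level + self.bits_per_frame)
        self.virtual_level = min(
            self.virtual_max_bits,
            self.virtual_level + self.metadata_bits_per_frame,
            self.level,
        )
        # Metadata draws from the virtual reservoir (and thus the overall one).
        metadata_bits = min(metadata_demand, self.virtual_level)
        self.virtual_level -= metadata_bits
        self.level -= metadata_bits
        # Audio draws from the overall reservoir.
        audio_bits = min(audio_demand, self.level)
        self.level -= audio_bits
        # Keep the virtual reservoir a subset of the overall reservoir.
        self.virtual_level = min(self.virtual_level, self.level)
        return audio_bits, metadata_bits
```

In this sketch a frame with little metadata leaves bits in the virtual reservoir for a later frame with more metadata, while the overall level bounds the total so the average bitrate stays constant.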
Abstract:
Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A compression process reduces the original dynamic range of an initial audio signal by dividing the initial audio signal into a plurality of segments using a defined window shape, calculating a wideband gain in the frequency domain using a non-energy based average of frequency domain samples of the initial audio signal, and applying individual gain values to amplify segments of relatively low intensity and attenuate segments of relatively high intensity. The compressed audio signal is then expanded back to substantially the original dynamic range by an expansion process that applies inverse gain values to amplify segments of relatively high intensity and attenuate segments of relatively low intensity. A QMF filterbank is used to analyze the initial audio signal to obtain a frequency domain representation.
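The gain computation can be illustrated with a small sketch that uses the mean absolute value of a segment as the non-energy based average. The exponent `alpha`, the function names, and the omission of windowing and of the QMF filterbank are simplifying assumptions; each segment is taken to be a list of frequency-domain samples that has already been obtained.

```python
def mean_abs(segment):
    """Non-energy based average: mean magnitude rather than mean power."""
    return sum(abs(x) for x in segment) / len(segment)


def compress(segments, alpha=0.65):
    """Apply one wideband gain per segment: gain = mean_abs ** (alpha - 1).

    With 0 < alpha < 1, segments quieter than unit level are amplified
    and louder segments are attenuated (alpha is an assumed parameter).
    """
    out = []
    for seg in segments:
        m = mean_abs(seg)
        gain = m ** (alpha - 1.0) if m > 0 else 1.0
        out.append([x * gain for x in seg])
    return out


def expand(segments, alpha=0.65):
    """Invert compress() using only the compressed segments.

    A compressed segment has mean magnitude m ** alpha, so the inverse
    gain m ** (1 - alpha) equals mean_abs(compressed) ** ((1 - alpha) / alpha).
    """
    out = []
    for seg in segments:
        m_c = mean_abs(seg)
        gain = m_c ** ((1.0 - alpha) / alpha) if m_c > 0 else 1.0
        out.append([x * gain for x in seg])
    return out
```

Because the inverse gain is derived from the compressed segment itself, this sketch needs no side information to undo the compression, apart from any coding noise added in between.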
Abstract:
Described are methods of processing an audio signal for packet loss concealment. The audio signal comprises a sequence of frames, each frame containing representations of a plurality of audio channels and reconstruction parameters for upmixing the plurality of audio channels to a predetermined channel format. One method includes: receiving the audio signal; and generating a reconstructed audio signal in the predetermined channel format based on the received audio signal. Generating the reconstructed audio signal comprises: determining whether at least one frame of the audio signal has been lost; and if a number of consecutively lost frames exceeds a first threshold, fading the reconstructed audio signal to a predefined spatial configuration. Also described is a method of encoding an audio signal. Yet further described are apparatus for carrying out the methods, as well as corresponding programs and computer-readable storage media.
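The loss-counting and fading logic might look roughly like the sketch below. It assumes the reconstruction parameters of a frame can be represented as a flat list of floats and that fading is a linear interpolation towards a predefined parameter set; the class name, the hold behaviour below the threshold, and the fade length are hypothetical.

```python
class ConcealmentState:
    """Tracks consecutive frame losses and fades reconstruction parameters."""

    def __init__(self, predefined_params, loss_threshold=2, fade_frames=4):
        self.predefined = list(predefined_params)  # predefined spatial configuration
        self.loss_threshold = loss_threshold       # the "first threshold"
        self.fade_frames = fade_frames             # assumed fade duration in frames
        self.consecutive_lost = 0
        self.last_params = list(predefined_params)

    def next_params(self, received_params):
        """Return parameters for the next frame; None marks a lost frame."""
        if received_params is not None:
            self.consecutive_lost = 0
            self.last_params = list(received_params)
            return list(received_params)

        self.consecutive_lost += 1
        excess = self.consecutive_lost - self.loss_threshold
        if excess <= 0:
            # Short loss: hold the last good parameters (assumption).
            return list(self.last_params)
        # Longer loss: interpolate towards the predefined configuration.
        w = min(1.0, excess / self.fade_frames)
        return [(1.0 - w) * p + w * q
                for p, q in zip(self.last_params, self.predefined)]
```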
Abstract:
A method is described for determining mantissa bit allocation of audio data values of frequency domain audio data to be encoded. The allocation method includes a step of determining masking values for the audio data values, including by performing adaptive low frequency compensation on the audio data of each frequency band of a set of low frequency bands of the audio data. The adaptive low frequency compensation includes steps of: performing tonality detection on the audio data to generate compensation control data indicative of whether each frequency band in the set of low frequency bands has prominent tonal content; and performing low frequency compensation on the audio data in each frequency band in the set of low frequency bands having prominent tonal content as indicated by the compensation control data, but not performing low frequency compensation on the audio data in any other frequency band in the set of low frequency bands.
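The gating described above, where compensation is applied only to low frequency bands flagged as prominently tonal, can be sketched as follows. The tonality detector (a band standing out from its neighbours by a fixed margin), the masking offset, and the compensation amount are hypothetical placeholders; the abstract does not specify any of them.

```python
def detect_tonal_bands(band_levels_db, num_low_bands, prominence_db=12.0):
    """Compensation control data: one flag per low frequency band.

    Assumption: a band has prominent tonal content when its level exceeds
    each adjacent band by at least prominence_db.
    """
    flags = []
    for b in range(num_low_bands):
        neighbours = [band_levels_db[i] for i in (b - 1, b + 1)
                      if 0 <= i < len(band_levels_db)]
        flags.append(bool(neighbours) and all(
            band_levels_db[b] - n >= prominence_db for n in neighbours))
    return flags


def masking_values(band_levels_db, num_low_bands,
                   mask_offset_db=20.0, compensation_db=6.0):
    """Masking value per band, with compensation only on flagged bands.

    Assumption: compensation lowers the masking value, so that more
    mantissa bits would be allocated to the tonal low frequency band.
    """
    flags = detect_tonal_bands(band_levels_db, num_low_bands)
    mask = [level - mask_offset_db for level in band_levels_db]
    for b, tonal in enumerate(flags):
        if tonal:
            mask[b] -= compensation_db
    return mask, flags
```

Bands outside the low frequency set, and low frequency bands that are not flagged, keep their uncompensated masking value.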