Abstract:
The present document relates to audio encoding / decoding. In particular, the present document relates to a method and system for reducing the complexity of a bit allocation process used in the context of audio encoding / decoding. An audio encoder (300) configured to encode an audio signal according to a first audio codec system is described. The audio encoder (300) comprises a transform unit (302) configured to determine a set of spectral coefficients (312) based on the audio signal. Furthermore, the encoder (300) comprises a floating-point encoding unit (304) configured to determine a set of scale factors and a set of scaled values (314), based on the set of spectral coefficients (312); and to encode the set of scale factors to yield a set of encoded scale factors (313). In addition, the encoder (300) comprises a bit allocation and quantization unit (305, 306) configured to determine a total number of available bits for quantizing the set of scaled values (314), based on a first target data-rate and based on the number of bits used for the set of encoded scale factors (313); to determine a first control parameter (315) indicative of an allocation of the total number of available bits for quantizing the scaled values of the set of scaled values (314); and to quantize the set of scaled values (314) in accordance with the first control parameter (315) to yield a set of quantized scaled values (317). Furthermore, the encoder (300) comprises a transcoding simulation unit (320) configured to determine a second control parameter (321) based on the first control parameter (315); wherein the second control parameter (321) enables a transcoder to convert the first bitstream into a second bitstream at a second target data-rate; wherein the second bitstream conforms to a second audio codec system different from the first audio codec system; and wherein the first bitstream comprises the second control parameter.
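The bit budget described in the abstract above can be illustrated with a minimal sketch. It assumes the per-frame budget is the target data-rate times the frame duration, minus the bits already spent on the encoded scale factors. The function name, the `overhead_bits` parameter and the example frame size and sample rate are hypothetical and are not taken from the abstract.

```python
def available_bits(target_rate_bps, frame_samples, sample_rate_hz,
                   scale_factor_bits, overhead_bits=0):
    """Bits left for quantizing the scaled values of one frame.

    Assumption: the frame budget is target_rate * frame_duration, and the
    encoded scale factors (plus optional side information) are paid first.
    """
    # Integer arithmetic avoids floating-point rounding of the frame budget.
    frame_bits = target_rate_bps * frame_samples // sample_rate_hz
    return max(frame_bits - scale_factor_bits - overhead_bits, 0)


# Illustrative values only: 192 kbit/s, 1536 samples per frame at 48 kHz,
# 1200 bits already used for the encoded scale factors.
budget = available_bits(192_000, 1536, 48_000, scale_factor_bits=1200)
print(budget)
```

A control parameter such as the first control parameter (315) would then be chosen so that the quantized scaled values fit within this budget.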
Abstract:
The present document relates to the technical field of audio coding, decoding and processing. It specifically relates to methods of recovering high frequency content of an audio signal from low frequency content of the same audio signal in an efficient manner. A method for determining a first banded tonality value (311, 312) for a first frequency subband (205) of an audio signal is described. The first banded tonality value (311, 312) is used for approximating a high frequency component of the audio signal based on a low frequency component of the audio signal. The method comprises determining a set of transform coefficients in a corresponding set of frequency bins based on a block of samples of the audio signal; determining a set of bin tonality values (341) for the set of frequency bins using the set of transform coefficients, respectively; and combining a first subset of two or more of the set of bin tonality values (341) for two or more corresponding adjacent frequency bins of the set of frequency bins lying within the first frequency subband, thereby yielding the first banded tonality value (311, 312) for the first frequency subband.
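The combining step in the abstract above can be sketched as follows. The abstract does not say how the bin tonality values are combined, so the arithmetic mean used here is an assumption, and the function name and the bin indices are hypothetical.

```python
def banded_tonality(bin_tonality, first_bin, last_bin):
    """Combine per-bin tonality values of adjacent bins into one banded value.

    Assumption: the combination is an arithmetic mean; a sum or a weighted
    average would fit the same description.
    """
    values = bin_tonality[first_bin:last_bin + 1]
    if len(values) < 2:
        raise ValueError("a subband must cover two or more adjacent bins")
    return sum(values) / len(values)


# Hypothetical per-bin tonality values for four frequency bins.
bin_tonality = [0.25, 0.75, 0.5, 0.0]
print(banded_tonality(bin_tonality, 1, 2))  # subband covering bins 1 and 2
```

Because the per-bin values are computed once, the same list can be reused for several, possibly overlapping, subbands.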
Abstract:
The present invention relates to techniques for authentication of data streams. Specifically, the invention relates to the insertion of identifiers into a data stream, such as a Dolby Pulse, AAC or HE AAC bitstream, and the authentication and verification of the data stream based on such identifiers. A method and system for encoding a data stream comprising a plurality of data frames is described. The method comprises the step of generating a cryptographic value of a number N of successive data frames and configuration information, wherein the configuration information comprises information for rendering the data stream. The method then inserts the cryptographic value into the data stream subsequent to the N successive data frames.
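A minimal sketch of the insertion scheme in the abstract above is given below. The abstract only speaks of a "cryptographic value"; using HMAC-SHA256 from the Python standard library is an assumption, as are the function names, the byte-string representation of frames and configuration, and the choice to leave a trailing incomplete group unsigned.

```python
import hashlib
import hmac


def group_signature(key, config, frames):
    """Cryptographic value over the configuration and N successive frames.

    Assumption: HMAC-SHA256 over the configuration followed by the frames.
    """
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(config)
    for frame in frames:
        mac.update(frame)
    return mac.digest()


def insert_signatures(key, config, frames, n):
    """Return the stream with a signature inserted after each group of n frames."""
    out = []
    for start in range(0, len(frames), n):
        group = frames[start:start + n]
        out.extend(group)
        if len(group) == n:  # a trailing incomplete group stays unsigned here
            out.append(group_signature(key, config, group))
    return out


stream = insert_signatures(b"secret", b"config", [b"f0", b"f1", b"f2", b"f3"], n=2)
print(len(stream))  # four frames plus two inserted values
```

A receiver holding the same key could recompute the value over each group and compare it with the inserted one using `hmac.compare_digest`; a mismatch would indicate modified frames or modified configuration information.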
Abstract:
A method for determining mantissa bit allocation of audio data values of frequency domain audio data to be encoded. The allocation method includes a step of determining masking values for the audio data values, including by performing adaptive low frequency compensation on the audio data of each frequency band of a set of low frequency bands of the audio data. The adaptive low frequency compensation includes steps of: performing tonality detection on the audio data to generate compensation control data indicative of whether each frequency band in the set of low frequency bands has prominent tonal content; and performing low frequency compensation on the audio data in each frequency band in the set of low frequency bands having prominent tonal content as indicated by the compensation control data, but not performing low frequency compensation on the audio data in any other frequency band in the set of low frequency bands.
Abstract:
The present document relates to processing of multimedia data, notably the encoding, the transmission, the decoding and the rendering of multimedia data, e.g. audio files or bitstreams. In particular, the present document relates to the implementation of loudness control in multimedia players. A method for providing loudness related data to a media player is described. The method comprises the steps of providing a first loudness related value associated with an audio signal; wherein the first loudness related value has been determined according to a first procedure; of converting the first loudness related value into a second loudness related value using a reversible relation; wherein the second loudness related value is associated with a second procedure for determining loudness related values; of storing the second loudness related value in metadata associated with the audio signal; and of providing the metadata to the media player.
Abstract:
Many portable playback devices cannot decode and play back encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed.
Abstract:
The present disclosure relates to methods and apparatus for audio coding. A method of encoding a portion of an audio signal comprises determining whether the portion of the audio signal is likely to contain dense transient events, and if it is determined that the portion of the audio signal is likely to contain dense transient events, quantizing the portion of the audio signal using a quantization mode that applies a substantially constant signal-to-noise ratio over frequency for the portion of the audio signal. The present disclosure further relates to a method of detecting dense transient events in a portion of an audio signal.
Abstract:
The present document relates to methods and systems for music information retrieval (MIR). In particular, the present document relates to methods and systems for extracting a chroma vector from an audio signal. A method (900) for determining a chroma vector (100) for a block of samples of an audio signal (301) is described. The method (900) comprises receiving (901) a corresponding block of frequency coefficients derived from the block of samples of the audio signal (301) from a core encoder (412) of a spectral band replication based audio encoder (410) adapted to generate an encoded bitstream (305) of the audio signal (301) from the block of frequency coefficients; and determining (904) the chroma vector (100) for the block of samples of the audio signal (301) based on the received block of frequency coefficients.
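The chroma computation in the abstract above can be sketched under several assumptions that are not stated in it: the frequency coefficients are taken to be MDCT-style coefficients with bin centers at (k + 0.5) · fs / (2N), the energy of each bin is assigned to the nearest equal-tempered semitone relative to 440 Hz, and the result is normalized to sum to one. The function name and all parameters are hypothetical.

```python
import math


def chroma_vector(coeffs, sample_rate_hz, ref_hz=440.0):
    """Fold the energy of frequency coefficients into 12 pitch classes.

    Assumptions: MDCT-style bin centers, nearest-semitone mapping, and
    pitch class 0 corresponding to the reference pitch (A at 440 Hz).
    """
    n = len(coeffs)
    chroma = [0.0] * 12
    for k, c in enumerate(coeffs):
        freq = (k + 0.5) * sample_rate_hz / (2 * n)  # assumed bin center in Hz
        semitone = round(12 * math.log2(freq / ref_hz))
        chroma[semitone % 12] += c * c  # accumulate bin energy
    total = sum(chroma)
    return [v / total for v in chroma] if total > 0 else chroma


# A block of 1024 coefficients at 48 kHz with energy only near 440 Hz (bin 18).
coeffs = [0.0] * 1024
coeffs[18] = 1.0
chroma = chroma_vector(coeffs, 48_000)
print(chroma.index(max(chroma)))
```

Reusing coefficients that the core encoder has already computed avoids a separate transform for the retrieval task, which is the efficiency argument the abstract makes.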
Abstract:
The invention discloses a method and an encoder for processing a digital audio stereo signal. A digital audio encoder for coding such an audio signal comprises a predictive Temporal Noise Shaping (TNS) filter, a Mid/Side (M/S) coding unit, and a control unit for determining a first prediction gain related to the unmodified L/R signal processed by the TNS filter and a second prediction gain related to the M/S-coded L/R signal processed by the TNS filter. The control unit is adapted to disable TNS filtering, i.e. to bypass the TNS filter, for a current signal frame if the first and second prediction gains differ by more than a pre-determined mismatch range. Preferably, the first and second prediction gains are determined from signal energy ratios calculated for each channel of the stereo signal, namely the signal energies of the TNS-processed unmodified L and R signals and of the TNS-processed M/S-coded L and R signals, each divided by the respective signal energy before TNS processing. Furthermore, the control unit is preferably adapted to overrule the disabling of the TNS filter if the input signal is a near-mono audio signal exhibiting only low energy in either its M or its S band. In that case, operation of the TNS filter on the stereo audio signal is maintained.
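The control logic in the abstract above can be sketched as a small decision function. It simplifies the per-channel gains to one scalar per coding mode and compares them on a linear scale; both simplifications, the near-mono energy threshold and all names are assumptions rather than details from the abstract.

```python
def energy(samples):
    """Signal energy as the sum of squared samples."""
    return sum(s * s for s in samples)


def prediction_gain(before_tns, after_tns):
    """Energy ratio as described: energy after TNS divided by energy before."""
    return energy(after_tns) / energy(before_tns)


def tns_enabled(gain_lr, gain_ms, mismatch_range,
                energy_m, energy_s, near_mono_threshold):
    """Decide whether TNS stays active for the current frame.

    Assumption: "near-mono" means the M or the S energy is below a threshold.
    """
    if min(energy_m, energy_s) < near_mono_threshold:
        return True  # near-mono input overrules the disabling of TNS
    return abs(gain_lr - gain_ms) <= mismatch_range


# Hypothetical gains that differ by more than the allowed range: TNS is bypassed.
print(tns_enabled(0.5, 0.9, 0.2, energy_m=10.0, energy_s=10.0,
                  near_mono_threshold=0.01))
```

Checking the near-mono condition first mirrors the abstract's statement that it overrules the mismatch test.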
Abstract:
The present document relates to methods and systems for estimating the tempo of a media signal, such as an audio or combined video/audio signal. In particular, the document relates to the estimation of tempo perceived by human listeners, as well as to methods and systems for tempo estimation at scalable computational complexity. A method and system for extracting tempo information of an audio signal from an encoded bit-stream of the audio signal comprising spectral band replication data is described. The method comprises the steps of determining a payload quantity associated with the amount of spectral band replication data comprised in the encoded bit-stream for a time interval of the audio signal; repeating the determining step for successive time intervals of the encoded bit-stream of the audio signal, thereby determining a sequence of payload quantities; identifying a periodicity in the sequence of payload quantities; and extracting tempo information of the audio signal from the identified periodicity.
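The periodicity step in the abstract above can be sketched with a plain autocorrelation over the sequence of payload quantities. The abstract does not name a periodicity detector, so autocorrelation is an assumption, as are the tempo search range, the interval duration and the synthetic payload values in the example.

```python
import math


def estimate_tempo_bpm(payloads, interval_s, min_bpm=40.0, max_bpm=240.0):
    """Estimate tempo from the dominant period of a payload-size sequence.

    Assumption: the period is the lag with the largest (unnormalized)
    autocorrelation inside a plausible tempo range.
    """
    mean = sum(payloads) / len(payloads)
    x = [p - mean for p in payloads]
    min_lag = max(1, math.ceil(60.0 / (max_bpm * interval_s)))
    max_lag = min(len(x) - 1, math.floor(60.0 / (min_bpm * interval_s)))
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        raise ValueError("sequence too short for the requested tempo range")
    return 60.0 / (best_lag * interval_s)


# Synthetic payload sizes: a burst of SBR data every 10 intervals of 50 ms,
# i.e. one burst every 0.5 s.
payloads = [100 if i % 10 == 0 else 10 for i in range(200)]
print(estimate_tempo_bpm(payloads, interval_s=0.05))
```

Since only payload sizes are read from the bit-stream, this kind of estimate needs no decoding of the audio itself, which is what makes the approach computationally light.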