Abstract:
Disclosed is a binaural rendering method and apparatus for decoding a multichannel audio signal. The binaural rendering method may include: extracting an early reflection component and a late reverberation component from a binaural filter; generating a stereo audio signal by performing binaural rendering of a multichannel audio signal based on the early reflection component; and applying the late reverberation component to the generated stereo audio signal.
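The two-stage structure described in this abstract can be illustrated with a minimal sketch for one ear. It is not the patented method: the fixed split index, the averaging of the late tails into one shared tail, and the names `split_binaural_filter` and `render_ear` are all assumptions made for illustration.

```python
def convolve(x, h):
    """Full linear convolution of two sample sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def split_binaural_filter(ir, split):
    """Split an impulse response at a sample index (assumed split rule).

    The late part is zero-padded at the front so it keeps its original delay.
    """
    early = list(ir[:split])
    late = [0.0] * split + list(ir[split:])
    return early, late


def render_ear(channels, irs, split):
    """Render one ear: per-channel early parts, then one shared late tail.

    channels: list of equal-length sample lists, one per input channel.
    irs: list of equal-length impulse responses, one per input channel.
    """
    parts = [split_binaural_filter(ir, split) for ir in irs]

    # Stage 1: convolve every channel with its own short early part and sum.
    rendered = None
    for x, (early, _) in zip(channels, parts):
        y = convolve(x, early)
        rendered = y if rendered is None else [a + b for a, b in zip(rendered, y)]

    # Stage 2 (assumption): average the late tails into one shared tail and
    # apply it once to the already-rendered signal.
    tail_len = len(parts[0][1])
    shared_late = [sum(late[k] for _, late in parts) / len(parts)
                   for k in range(tail_len)]
    wet = convolve(rendered, shared_late)
    dry = rendered + [0.0] * (len(wet) - len(rendered))
    return [d + w for d, w in zip(dry, wet)]
```

Calling `render_ear` once with the left-ear responses and once with the right-ear responses yields a stereo pair. Under these assumptions the long late tail is convolved once per ear instead of once per input channel, so the result approximates, rather than equals, full per-channel convolution.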
Abstract:
Provided is a tag insertion method performed by an apparatus for inserting a tag into a stereo audio signal, the method including receiving an original stereo audio signal, analyzing an energy distribution of the original stereo audio signal based on an azimuth, determining valid azimuths for control information and for a plurality of pieces of tag information based on the energy distribution, wherein the control information is used to control tag information, modulating the plurality of pieces of tag information and the control information generated based on the valid azimuths, generating a left signal and a right signal based on the modulated control information and the plurality of pieces of modulated tag information, and generating a multi-tagged stereo audio signal by mixing the generated left signal and the generated right signal with the original stereo audio signal.
Abstract:
An apparatus and method for generating audio data and an apparatus and method for playing audio data are disclosed, in which the apparatus for playing the audio data may extract a descriptor related to a multichannel audio signal from a bitstream generated by the apparatus for generating the audio data, and play the multichannel audio signal based on the extracted descriptor, and the descriptor may include information on an audio signal included in the multichannel audio signal.
Abstract:
An audio encoding apparatus to encode an audio signal using lossless coding or lossy coding and an audio decoding apparatus to decode an encoded audio signal are disclosed. An audio encoding apparatus according to an exemplary embodiment may include an input signal type determination unit to determine a type of an input signal based on characteristics of the input signal, a residual signal generation unit to generate a residual signal based on an output signal from the input signal type determination unit, and a coding unit to perform lossless coding or lossy coding using the residual signal.
Abstract:
An apparatus and method for transmitting a plurality of audio objects using a multichannel encoder and a multichannel decoder are provided. The audio object encoder includes a multichannel encoder determination unit to determine a multichannel encoder to be used for encoding of a plurality of audio objects according to the number of the audio objects, an encoding unit to generate an encoded signal by encoding the plurality of audio objects using the determined multichannel encoder, and a multichannel audio object signal generation unit to generate a multichannel audio object signal by multiplexing sound image localization information of the plurality of audio objects along with the encoded signal.
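The selection step in this abstract can be sketched as follows. The abstract does not say which encoders are available or how localization information is represented, so the capacity table, the "smallest layout that fits" rule, the azimuth field, and the names `choose_encoder` and `build_object_signal` are hypothetical. The encoding itself is replaced by a placeholder.

```python
# Hypothetical encoder layouts and their channel capacities (not from the source).
ENCODER_CAPACITIES = {"mono": 1, "stereo": 2, "5.1": 6, "7.1": 8}


def choose_encoder(num_objects):
    """Pick the smallest hypothetical layout that can carry all objects."""
    for name, capacity in sorted(ENCODER_CAPACITIES.items(), key=lambda kv: kv[1]):
        if num_objects <= capacity:
            return name
    raise ValueError("no encoder layout can carry %d objects" % num_objects)


def build_object_signal(objects):
    """Bundle placeholder-encoded objects with their localization side info.

    objects: list of dicts with 'samples' and 'azimuth_deg' keys (assumed shape).
    """
    encoder = choose_encoder(len(objects))
    # Placeholder: a real system would run the chosen multichannel encoder here.
    encoded = {"encoder": encoder, "channels": [o["samples"] for o in objects]}
    localization = [o["azimuth_deg"] for o in objects]
    # "Multiplexing" is modelled as carrying both parts in one container.
    return {"encoded": encoded, "localization": localization}
```

For instance, three objects would be assigned the hypothetical "5.1" layout because the "stereo" layout holds only two.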
Abstract:
An audio signal encoding and decoding method using a neural network model, and an encoder and decoder for performing the same, are disclosed. A method of encoding an audio signal using a neural network model may include identifying an input signal, generating a quantized latent vector by inputting the input signal into a neural network model encoding the input signal, and generating a bitstream corresponding to the quantized latent vector, wherein the neural network model may include i) a feature extraction layer generating a latent vector by extracting a feature of the input signal, ii) a plurality of downsampling blocks downsampling the latent vector, and iii) a plurality of quantization blocks performing quantization of a downsampled latent vector.
Abstract:
Provided are an encoding method according to various example embodiments and an encoder performing the method. The encoding method includes outputting a linear prediction (LP) coefficients bitstream and a residual signal by performing a linear prediction analysis on an input signal, outputting a first latent signal obtained by encoding a periodic component of the residual signal, using a first neural network module, outputting a first bitstream obtained by quantizing the first latent signal, using a quantization module, outputting a second latent signal obtained by encoding an aperiodic component of the residual signal, using the first neural network module, and outputting a second bitstream obtained by quantizing the second latent signal, using the quantization module, wherein the aperiodic component of the residual signal is calculated based on a periodic component of the residual signal decoded from the quantized first latent signal output by de-quantizing the first bitstream.
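The first step of this method, linear prediction analysis yielding coefficients and a residual signal, can be sketched with the textbook autocorrelation method and the Levinson-Durbin recursion. This is one common way to perform LP analysis, and the abstract does not say which variant is used. Coefficient quantization into a bitstream and the neural network modules are not covered, and the function names are illustrative.

```python
def autocorr(x, order):
    """Autocorrelation r[0..order] of a finite frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p from autocorrelation r.

    Returns the coefficient list a (with a[0] == 1) and the prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


def lp_residual(x, a):
    """Residual e[n] = sum_j a[j] * x[n - j], treating x[n] as 0 for n < 0."""
    p = len(a) - 1
    return [sum(a[j] * x[n - j] for j in range(min(p, n) + 1))
            for n in range(len(x))]
```

As a usage example, for a decaying frame such as `x = [0.9 ** n for n in range(50)]`, a first-order analysis gives `a[1]` close to -0.9, and the residual carries far less energy than the input, which is what makes it the cheaper signal to encode in the later stages.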
Abstract:
An encoding apparatus and a decoding apparatus in a transform between a Modified Discrete Cosine Transform (MDCT)-based coder and a different coder are provided. The encoding apparatus may encode additional information to restore an input signal encoded according to the MDCT-based coding scheme, when switching occurs between the MDCT-based coder and the different coder. Accordingly, an unnecessary bitstream may be prevented from being generated, and minimum additional information may be encoded.
Abstract:
A method of generating a residual signal performed by an encoder includes identifying an input signal including an audio sample, generating a first residual signal from the input signal using linear predictive coding (LPC), generating a second residual signal having a smaller amount of information than the first residual signal by transforming the first residual signal, transforming the second residual signal into a frequency domain, and generating a third residual signal having a smaller amount of information than the second residual signal from the transformed second residual signal using frequency-domain prediction (FDP) coding.