Abstract:
A method (700, 800) and apparatus (100, 200) process audio frames to transition between different codecs. The method can include producing (720), using a first coding method, a first frame of coded output audio samples by coding a first audio frame in a sequence of frames. The method can include forming (730) an overlap-add portion of the first frame using the first coding method. The method can include generating (740) a combination first frame of coded audio samples based on combining the first frame of coded output audio samples with the overlap-add portion of the first frame. The method can include initializing (760) a state of a second coding method based on the combination first frame of coded audio samples. The method can include constructing (770) an output signal based on the initialized state of the second coding method.
Abstract:
During operation, a multiple channel audio input signal is received and coded to generate a coded audio signal. A balance factor having balance factor components, each associated with an audio signal of the multiple channel audio signal, is generated. A gain value to be applied to the coded audio signal to generate an estimate of the multiple channel audio signal is determined based on the balance factor and the multiple channel audio signal, with the gain value configured to minimize a distortion value between the multiple channel audio signal and the estimate of the multiple channel audio signal. A representation of the gain value may be output for transmission and/or storage.
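The gain determination above can be illustrated with a minimal sketch. It assumes a squared-error distortion, a single (mono) coded signal, and a per-channel estimate of the form gain × balance component × coded signal; none of these specifics are stated in the abstract, and the function and parameter names are hypothetical.

```python
def optimal_gain(channels, coded, balance):
    """Least-squares gain g minimizing sum_c ||channels[c] - g * balance[c] * coded||^2.

    Assumption (not from the source): squared-error distortion and a
    per-channel estimate g * balance[c] * coded.
    """
    coded_energy = sum(x * x for x in coded)
    # Numerator: balance-weighted correlation of each channel with the coded signal.
    numerator = sum(
        w * sum(s * x for s, x in zip(channel, coded))
        for channel, w in zip(channels, balance)
    )
    denominator = sum(w * w for w in balance) * coded_energy
    if denominator == 0.0:
        return 0.0  # Silent coded signal or all-zero balance: no useful gain.
    return numerator / denominator
```

Setting the derivative of the summed squared error with respect to the gain to zero yields this closed form, which is why no search over candidate gains is needed in this simplified setting.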
Abstract:
An encoder/decoder architecture (200, 300, 700) uses an arithmetic encoder (220) to encode the MSB portions of the output of a Factorial Pulse Coder (212), which in turn encodes the output of a first-level source encoder (210), e.g., MDCT. Sub-parts (e.g., frequency bands) of portions (e.g., frames) of the signal are suitably sorted in increasing order based on a measure related to signal energy (e.g., signal energy itself). Doing this in a system (100) that overlays Arithmetic Encoding on Factorial Pulse Coding results in bits being re-allocated to bands with higher signal energy content, ultimately yielding higher signal quality and higher bit utilization efficiency.
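The band-sorting step can be sketched as follows. This is only an illustration that assumes the energy measure is the sum of squared coefficients in each band; the band layout and the function name are hypothetical, and the arithmetic and Factorial Pulse Coding stages are not modeled.

```python
def sort_bands_by_energy(coeffs, band_edges):
    """Return (order, energies): band indices sorted by increasing energy.

    band_edges lists the start index of each band plus one final end index,
    e.g. [0, 2, 4, 6] describes three bands of two coefficients each.
    """
    bands = [coeffs[a:b] for a, b in zip(band_edges[:-1], band_edges[1:])]
    energies = [sum(c * c for c in band) for band in bands]
    # sorted() is stable, so bands with equal energy keep their original order.
    order = sorted(range(len(bands)), key=lambda i: energies[i])
    return order, energies
```

For example, `sort_bands_by_energy([3, 0, 1, 1, 0, 0], [0, 2, 4, 6])` visits the lowest-energy band first and the highest-energy band last.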
Abstract:
A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
Abstract:
Hybrid range coding/combinatorial coding (FPC) encoders and decoders are provided. Encoding and decoding can be dynamically switched between range coding and combinatorial coding according to the ratio of ones to the number of bits in a partial remaining sequence, in order to reduce the computational complexity of encoding and decoding.
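As a rough illustration of the combinatorial side, the sketch below ranks a binary sequence among all sequences with the same length and number of ones (a standard enumerative code) and inverts that ranking. The `pick_coder` helper is purely hypothetical: the abstract does not give the switching threshold or say which coder is chosen on which side of it, so both are placeholders. The range coder itself is not implemented.

```python
from math import comb


def combinatorial_encode(bits):
    """Lexicographic rank of bits among sequences with the same length and ones count."""
    n, k = len(bits), sum(bits)
    index = 0
    for pos, bit in enumerate(bits):
        remaining = n - pos - 1
        if bit:
            # Skip all sequences with a 0 here that still place k ones afterwards.
            index += comb(remaining, k)
            k -= 1
    return index


def combinatorial_decode(index, n, k):
    """Inverse of combinatorial_encode for a sequence of n bits containing k ones."""
    bits = []
    for pos in range(n):
        skipped = comb(n - pos - 1, k)
        if k > 0 and index >= skipped:
            bits.append(1)
            index -= skipped
            k -= 1
        else:
            bits.append(0)
    return bits


def pick_coder(remaining_bits, threshold=0.25):
    """Hypothetical switching rule; threshold and direction are placeholders."""
    if not remaining_bits:
        return "range"
    ones_ratio = sum(remaining_bits) / len(remaining_bits)
    return "combinatorial" if ones_ratio < threshold else "range"
```

The decoder needs the sequence length and the number of ones as side information, which is the usual arrangement for this kind of enumerative code.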
Abstract:
A set of peaks in a reconstructed audio vector Ŝ of a received audio signal is detected and a scaling mask ψ(Ŝ) based on the detected set of peaks is generated. A gain vector g* is generated based on at least the scaling mask and an index j representative of the gain vector. The reconstructed audio signal is scaled with the gain vector to produce a scaled reconstructed audio signal. A distortion is generated based on the audio signal and the scaled reconstructed audio signal. The index of the gain vector based on the generated distortion is output.
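The peak-based gain search can be sketched under several assumptions that are not in the abstract: peaks are local maxima of the magnitude that exceed a relative threshold, the mask is binary, each codebook entry is a hypothetical (peak gain, non-peak gain) pair, and the distortion is the squared error. All names are illustrative.

```python
def scaling_mask(s_hat, rel_threshold=0.5):
    """Binary mask: 1.0 at local magnitude maxima above rel_threshold * max, else 0.0."""
    mags = [abs(x) for x in s_hat]
    limit = rel_threshold * max(mags, default=0.0)
    mask = []
    for k, m in enumerate(mags):
        left = mags[k - 1] if k > 0 else 0.0
        right = mags[k + 1] if k + 1 < len(mags) else 0.0
        is_peak = m > 0.0 and m >= limit and m >= left and m >= right
        mask.append(1.0 if is_peak else 0.0)
    return mask


def best_gain_index(s, s_hat, codebook):
    """Return the index j whose gain vector gives the lowest squared error."""
    mask = scaling_mask(s_hat)
    best_j, best_err = None, float("inf")
    for j, (peak_gain, floor_gain) in enumerate(codebook):
        # Gain vector g_j: one gain on detected peaks, another everywhere else.
        gains = [peak_gain if m else floor_gain for m in mask]
        err = sum((a - g * b) ** 2 for a, g, b in zip(s, gains, s_hat))
        if err < best_err:
            best_j, best_err = j, err
    return best_j
```

Because the mask is computed from the reconstructed vector alone, a decoder running the same `scaling_mask` could rebuild the gain vector from the transmitted index without receiving the mask.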
Abstract:
A method for encoding audio frames by producing a first frame of coded audio samples by coding a first audio frame in a sequence of frames, producing at least a portion of a second frame of coded audio samples by coding at least a portion of a second audio frame in the sequence of frames, and producing parameters for generating audio gap filler samples, wherein the parameters are representative of either a weighted segment of the first frame of coded audio samples or a weighted segment of the portion of the second frame of coded audio samples.
Abstract:
A method and apparatus for prediction in a speech-coding system is provided herein. The method of a first-order long-term predictor (LTP) filter, using a sub-sample resolution delay, is extended to a multi-tap LTP filter, or, viewed from another vantage point, the conventional integer-sample resolution multi-tap LTP filter is extended to use sub-sample resolution delay. This novel formulation of a multi-tap LTP filter offers a number of advantages over the prior-art LTP filter configurations. In particular, defining the lag with sub-sample resolution makes it possible to explicitly model delay values that have a fractional component, within the limits of resolution of the over-sampling factor used by the interpolation filter. The coefficients of such a multi-tap LTP filter are thus largely freed from modeling the effect of delays that have a fractional component. Consequently, their main function is to maximize the prediction gain of the LTP filter by modeling the degree of periodicity that is present and by imposing spectral shaping.
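A multi-tap predictor with a fractional lag can be sketched as below. Linear interpolation stands in for the over-sampling interpolation filter mentioned above, which is a deliberate simplification; the tap layout (taps centred on the lag, one sample apart) and all names are assumptions rather than details from the abstract.

```python
import math


def fractional_sample(x, t):
    """Value of x at real-valued index t by linear interpolation; zero outside x."""
    i = math.floor(t)
    frac = t - i

    def at(k):
        return x[k] if 0 <= k < len(x) else 0.0

    return (1.0 - frac) * at(i) + frac * at(i + 1)


def ltp_predict(history, n, lag, taps):
    """Predict sample n as sum_i taps[i] * history(n - lag - offset_i).

    offset_i runs over -K..K for 2K+1 taps, so the middle tap sits exactly
    at the (possibly fractional) lag.
    """
    half = len(taps) // 2
    return sum(
        b * fractional_sample(history, n - lag - (i - half))
        for i, b in enumerate(taps)
    )
```

With the fractional part of the delay carried by `lag`, the taps are left to shape the spectrum and scale the prediction, which mirrors the division of labor described in the abstract.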