Abstract:
An apparatus for encoding an audio or image signal comprises: a controllable windower (102) for windowing the audio or image signal to provide a sequence of blocks of windowed samples; a converter (104) for converting the sequence of blocks of windowed samples into a spectral representation comprising a sequence of frames of spectral values; a transient location detector (106) for identifying a location of a transient within a transient look-ahead region of a frame; and a controller (108) for controlling the controllable windower (102) to apply a specific window having a specified overlap length to the audio or image signal in response to an identified location (210-213) of the transient, wherein the controller (108) is configured to select the specific window from a group of at least three windows comprising a first window (201) having a first overlap length (203), a second window (215) having a second overlap length (218), and a third window (224) having a third overlap length (229) or having no overlap, wherein the first overlap length (203) is greater than the second overlap length (218), and wherein the second overlap length (218) is greater than the third overlap length (229) or greater than an overlap of zero, wherein the specific window is selected based on the transient location such that one of two time-adjacent overlapping windows has first window coefficients at the location of the transient and the other of the two time-adjacent overlapping windows has second window coefficients at the location of the transient, wherein the second window coefficients are at least nine times greater than the first window coefficients.
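The selection rule in this abstract can be illustrated with a minimal sketch. The overlap lengths (256, 128, 16 samples), the sine-shaped overlap slopes, the centering of the overlap on the frame boundary, and the names `slope_pair` and `select_overlap` are all assumptions made for illustration; the abstract only fixes the ordering of the overlap lengths and the factor of nine between the two windows' coefficients at the transient.

```python
import math

def slope_pair(overlap, boundary, n):
    """Return (falling, rising) window coefficients at sample n.

    Assumes sine-shaped slopes and an overlap region of `overlap` samples
    centered on the frame boundary (both are illustrative assumptions).
    """
    start = boundary - overlap / 2
    if n < start:                 # before the overlap: only the earlier window is active
        return 1.0, 0.0
    if n >= start + overlap:      # after the overlap: only the later window is active
        return 0.0, 1.0
    phase = (n - start + 0.5) / overlap * math.pi / 2
    return math.cos(phase), math.sin(phase)

def select_overlap(transient_pos, boundary, overlaps=(256, 128, 16), min_ratio=9.0):
    """Pick the longest overlap for which one window's coefficient at the
    transient is at least `min_ratio` times the other's; 0 means no overlap."""
    for overlap in sorted(overlaps, reverse=True):
        falling, rising = slope_pair(overlap, boundary, transient_pos)
        low, high = sorted((falling, rising))
        if high >= min_ratio * low:
            return overlap
    return 0

if __name__ == "__main__":
    for pos in (800, 950, 1000, 1024):
        print(pos, select_overlap(pos, boundary=1024))
```

Trying the longest overlap first reflects the usual preference for long overlaps when no transient is nearby; whether the described apparatus orders its candidates this way is not stated in the abstract.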
Abstract:
An audio decoder (100; 300) for providing a decoded audio information (112; 312) on the basis of an encoded audio information (110; 310) comprises an error concealment (130; 380; 500) configured to provide an error concealment audio information (132; 382; 512) for concealing a loss of an audio frame following an audio frame encoded in a frequency domain representation (322) using a time domain excitation signal (532).
Abstract:
The present invention relates to an audio encoder (100, 101) for encoding an audio signal (PCM_i) comprising a pulse portion (P) and a stationary portion, the audio encoder comprising: a pulse extractor (11, 110) configured for extracting the pulse portion (P) from the audio signal (PCM_i), further comprising a pulse coder (132) for encoding the extracted pulse portion (P) to acquire an encoded pulse portion (CP), wherein the pulse extractor (110) is configured to determine a spectrogram of the audio signal (PCM_i) to extract the pulse portion (P), and wherein the spectrogram has a higher time resolution than the signal encoder (152, 156'); a signal encoder (152, 156') configured for encoding a residual (R) signal derived from the audio signal (PCM_i) to acquire an encoded residual (CR) signal, the residual (R) signal being derived from the audio signal (PCM_i) so that the pulse portion (P) is reduced or eliminated from the audio signal (PCM_i); and an output interface (170) configured for outputting the encoded pulse portion (CP) and the encoded residual (CR) signal to provide an encoded signal.
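The split into a pulse portion and a residual can be sketched roughly as follows. This is not the spectrogram-based extractor of the abstract: it uses short-block energy as a crude stand-in for a high-time-resolution analysis, and the block size, the threshold factor, and the name `split_pulse` are hypothetical. It only shows the property the abstract relies on, namely that the residual is the input with the pulse portion removed.

```python
def split_pulse(signal, block=4, factor=4.0):
    """Split `signal` into (pulse, residual) with pulse + residual == signal.

    A block counts as a pulse when its energy exceeds `factor` times the
    median block energy (an illustrative rule, not the patented one).
    """
    if not signal:
        return [], []
    energies = [sum(s * s for s in signal[i:i + block])
                for i in range(0, len(signal), block)]
    median = sorted(energies)[len(energies) // 2]
    pulse = [0.0] * len(signal)
    for b, energy in enumerate(energies):
        if energy > 0 and energy > factor * median:
            for i in range(b * block, min((b + 1) * block, len(signal))):
                pulse[i] = signal[i]
    # The residual is what remains once the pulse samples are subtracted.
    residual = [s - p for s, p in zip(signal, pulse)]
    return pulse, residual

if __name__ == "__main__":
    sig = [0.1, -0.1] * 8
    sig[9] = 5.0  # a single strong click
    pulse, residual = split_pulse(sig)
    print(pulse[8:12], residual[8:12])
```

In an encoder along the lines of the abstract, `pulse` would go to the pulse coder and `residual` to the signal encoder; both coders are outside the scope of this sketch.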
Abstract:
An audio decoder (100; 300) for providing a decoded audio information (112; 312) on the basis of an encoded audio information (110; 310) comprises an error concealment (130; 380; 500) configured to provide an error concealment audio information (132; 382; 512) for concealing a loss of an audio frame following an audio frame encoded in a frequency domain representation (322) using a time domain excitation signal (532). The time domain excitation signal or copies thereof are modified by applying a gradual reduction of a gain depending on the pitch period length, depending on a pitch change per time unit, or depending on whether a pitch prediction fails or succeeds.
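A gain reduction that depends on the pitch period length can be sketched as below. The sketch assumes that the concealment excitation is built by repeating the last pitch cycle and that the gain decays by a fixed factor per repeated cycle, so a shorter pitch period fades out faster in time. The decay factor and the name `conceal_excitation` are hypothetical, and the dependence on pitch change or pitch prediction success is not modeled.

```python
def conceal_excitation(last_excitation, pitch_period, num_samples, per_cycle_decay=0.9):
    """Build a concealment excitation from copies of the last pitch cycle.

    The gain falls smoothly by `per_cycle_decay` per pitch cycle, so the
    fade-out speed (per sample) depends on the pitch period length.
    """
    if pitch_period <= 0 or pitch_period > len(last_excitation):
        raise ValueError("pitch_period must fit inside last_excitation")
    cycle = last_excitation[-pitch_period:]
    out = []
    for n in range(num_samples):
        gain = per_cycle_decay ** (n / pitch_period)  # gradual, per-sample gain
        out.append(gain * cycle[n % pitch_period])
    return out

if __name__ == "__main__":
    history = [1.0] * 16
    short = conceal_excitation(history, pitch_period=4, num_samples=8, per_cycle_decay=0.5)
    long_ = conceal_excitation(history, pitch_period=8, num_samples=8, per_cycle_decay=0.5)
    print(short[4], long_[4])
```

In a decoder of the kind described, the resulting excitation would typically drive a synthesis filter to produce the concealment audio; that stage is omitted here.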
Abstract:
An audio decoder (100; 300) for providing a decoded audio information (112; 312) on the basis of an encoded audio information (110; 310) comprises an error concealment (130; 380; 500) configured to provide an error concealment audio information (132; 382; 512) for concealing a loss of an audio frame following an audio frame encoded in a frequency domain representation (322) using a time domain excitation signal (532).
Abstract:
Noise filling in perceptual transform audio codecs is improved by performing the noise filling with a spectrally global tilt, rather than in a spectrally flat manner.
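The difference between flat and tilted noise filling can be shown with a small sketch. It assumes that zero-quantized spectral lines are replaced by random-sign noise whose level changes by a constant number of decibels per bin. The parameter names, the dB-per-bin parameterization, and the function `fill_noise` are illustrative assumptions, not taken from the abstract.

```python
import random

def fill_noise(spectrum, noise_level, tilt_db_per_bin, seed=0):
    """Replace zero-quantized lines with noise whose level follows a global tilt.

    tilt_db_per_bin == 0 gives spectrally flat noise filling; a negative
    value makes the inserted noise quieter toward higher bins.
    """
    rng = random.Random(seed)
    filled = []
    for k, value in enumerate(spectrum):
        if value != 0:
            filled.append(value)  # keep lines that survived quantization
        else:
            gain = noise_level * 10 ** (tilt_db_per_bin * k / 20)
            filled.append(gain * rng.choice((-1.0, 1.0)))
    return filled

if __name__ == "__main__":
    quantized = [3, 0, 0, -2, 0, 0]
    print(fill_noise(quantized, noise_level=0.5, tilt_db_per_bin=-1.0))
```

With `tilt_db_per_bin=0` every inserted line has magnitude `noise_level`, which corresponds to the spectrally flat case the abstract contrasts against.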
Abstract:
An apparatus for determining an estimated pitch lag is provided. The apparatus comprises an input interface (110) for receiving a plurality of original pitch lag values, and a pitch lag estimator (120) for estimating the estimated pitch lag. The pitch lag estimator (120) is configured to estimate the estimated pitch lag depending on a plurality of original pitch lag values and depending on a plurality of information values, wherein for each original pitch lag value of the plurality of original pitch lag values, an information value of the plurality of information values is assigned to said original pitch lag value.
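One way to combine lag values with per-value information values is a weighted straight-line fit that is extrapolated one step ahead. The sketch below assumes exactly that, with the information values acting as non-negative weights (for example pitch gains). The abstract does not specify the estimator, so the fit, the extrapolation step, and the name `estimate_pitch_lag` are assumptions.

```python
def estimate_pitch_lag(lags, weights):
    """Extrapolate the next pitch lag from a weighted least-squares line.

    `lags[i]` is the original pitch lag at position i and `weights[i]` is
    the information value assigned to it (assumed non-negative).
    """
    if len(lags) != len(weights) or not lags:
        raise ValueError("lags and weights must be non-empty and equally long")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    xs = range(len(lags))
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, lags)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, lags))
    slope = sxy / sxx if sxx > 0 else 0.0  # fall back to the weighted mean
    return mean_y + slope * (len(lags) - mean_x)

if __name__ == "__main__":
    print(estimate_pitch_lag([50, 52, 54, 56], [1.0, 1.0, 1.0, 1.0]))
```

Lag values with small weights barely influence the fitted line, which is the point of assigning an information value to each original pitch lag value.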