MODEL BASED PREDICTION IN A CRITICALLY SAMPLED FILTERBANK

    Publication No.: EP4372602A3

    Publication date: 2024-07-10

    Application No.: EP24166625.4

    Filing date: 2014-01-07

    Inventor: VILLEMOES, Lars

    Abstract: The present document relates to audio source coding systems. In particular, the present document relates to audio source coding systems which make use of linear prediction in combination with a filterbank. A method for estimating a first sample (615) of a first subband signal in a first subband of an audio signal is described. The first subband signal of the audio signal is determined using an analysis filterbank (612) comprising a plurality of analysis filters which provide a plurality of subband signals in a plurality of subbands from the audio signal, respectively. The method comprises determining a model parameter (613) of a signal model; determining a prediction coefficient to be applied to a previous sample (614) of a first decoded subband signal derived from the first subband signal, based on the signal model, based on the model parameter (613) and based on the analysis filterbank (612); wherein a time slot of the previous sample (614) is prior to a time slot of the first sample (615); and determining an estimate of the first sample (615) by applying the prediction coefficient to the previous sample (614).
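How a prediction coefficient can depend on both the signal model and the analysis filterbank may be illustrated with a minimal sketch. It assumes, purely for illustration, an MDCT-like cosine-modulated analysis basis, a single-sinusoid signal model with random phase (model parameter: the angular frequency `omega`), and a one-tap predictor that uses only the previous time slot of the same subband. None of these choices, nor the function names, are taken from the patent.

```python
import numpy as np

def mdct_basis(M):
    """Analysis vectors (M x 2M) of an MDCT-like filterbank; an assumed stand-in."""
    n = np.arange(2 * M)
    k = np.arange(M)[:, None]
    window = np.sin(np.pi * (n + 0.5) / (2 * M))
    return np.sqrt(2.0 / M) * window * np.cos(np.pi / M * (n + 0.5 + M / 2) * (k + 0.5))

def subband_covariance(basis, k, lag, autocorr):
    """E[y_k[t] * y_k[t - lag]] for a stationary input with autocorrelation autocorr(tau).

    Subband sample: y_k[t] = sum_n basis[k, n] * x[t*M + n] (stride M, critically sampled).
    """
    M = basis.shape[0]
    n = np.arange(2 * M)
    tau = n[:, None] + lag * M - n[None, :]  # input-sample time differences
    return basis[k] @ autocorr(tau) @ basis[k]

def prediction_coefficient(basis, k, omega):
    """One-tap least-squares predictor under a random-phase sinusoid model."""
    autocorr = lambda tau: np.cos(omega * tau)  # model autocorrelation, up to scale
    r1 = subband_covariance(basis, k, 1, autocorr)
    r0 = subband_covariance(basis, k, 0, autocorr)
    return r1 / r0

def estimate_sample(coefficient, previous_sample):
    """Estimate of the current subband sample from the previous decoded sample."""
    return coefficient * previous_sample
```

The coefficient is obtained without looking at the signal itself: only the model parameter and the filterbank enter the computation, which is the property the abstract emphasises. A multi-tap predictor over several subbands and time slots would replace the scalar division by a solve of the corresponding normal equations.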

    MODEL BASED PREDICTION IN A CRITICALLY SAMPLED FILTERBANK
    Document type: Published application
    Legal status: In force

    Publication No.: EP2943953A1

    Publication date: 2015-11-18

    Application No.: EP14701146.4

    Filing date: 2014-01-07

    Inventor: VILLEMOES, Lars

    IPC classification: G10L19/093

    Abstract: The present document relates to audio source coding systems. In particular, the present document relates to audio source coding systems which make use of linear prediction in combination with a filterbank. A method for estimating a first sample (615) of a first subband signal in a first subband of an audio signal is described. The first subband signal of the audio signal is determined using an analysis filterbank (612) comprising a plurality of analysis filters which provide a plurality of subband signals in a plurality of subbands from the audio signal, respectively. The method comprises determining a model parameter (613) of a signal model; determining a prediction coefficient to be applied to a previous sample (614) of a first decoded subband signal derived from the first subband signal, based on the signal model, based on the model parameter (613) and based on the analysis filterbank (612); wherein a time slot of the previous sample (614) is prior to a time slot of the first sample (615); and determining an estimate of the first sample (615) by applying the prediction coefficient to the previous sample (614).


    CODING, MODIFICATION AND SYNTHESIS OF SPEECH SEGMENTS
    Document type: Published application
    Legal status: In force

    Publication No.: EP2517197A1

    Publication date: 2012-10-31

    Application No.: EP10801161.0

    Filing date: 2010-12-21

    Applicant: Telefónica, S.A.

    IPC classification: G10L13/02 G10L13/06

    Abstract: The invention relates to a method for speech signal analysis, modification and synthesis comprising a phase for the location of analysis windows by means of an iterative process for the determination of the phase of the first sinusoidal component and comparison between the phase value of said component and a predetermined value, a phase for the selection of analysis frames corresponding to an allophone and readjustment of the duration and the fundamental frequency according to certain thresholds and a phase for the generation of synthetic speech from synthesis frames taking the information of the closest analysis frame as spectral information of the synthesis frame and taking as many synthesis frames as periods that the synthetic signal has. The method allows a coherent location of the analysis windows within the periods of the signal and the exact generation of the synthesis instants in a manner synchronous with the fundamental period.

    Method and apparatus for decoding audio signal
    Document type: Published application
    Legal status: In force
    German title: Verfahren und Vorrichtung zur Dekodierung von Tonsignalen

    Publication No.: EP2357649A1

    Publication date: 2011-08-17

    Application No.: EP11151588.8

    Filing date: 2011-01-20

    IPC classification: G10L19/08 G10L19/14

    CPC classification: G10L19/093 G10L19/24

    Abstract: Provided are a method and an apparatus for decoding an audio signal. A method for decoding an audio signal encoded by a layered sinusoidal pulse coding scheme using one or more sinusoidal pulses includes decoding the encoded audio signal, setting a smoothing frequency band of the decoded audio signal according to a layer structure of the layered sinusoidal pulse coding scheme, dividing the smoothing frequency band into one or more subbands, and smoothing the decoded audio signal on a subband-by-subband basis. Accordingly, a decoding operation time can be reduced and the quality of a synthesized signal can be improved by variably setting a frequency band to be smoothed, when decoding an audio signal encoded by a layered sinusoidal pulse coding scheme using one or more sinusoidal pulses.


    MULTIMODE CODING OF SPEECH-LIKE AND NON-SPEECH-LIKE SIGNALS
    Document type: Published application
    Legal status: In force

    Publication No.: EP2269188A1

    Publication date: 2011-01-05

    Application No.: EP09720866.4

    Filing date: 2009-03-12

    IPC classification: G10L19/08 G10L19/12

    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
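The selection step described in this abstract, choosing codevectors and gains by minimizing the difference between the audio signal and its reconstruction, can be sketched as a generic analysis-by-synthesis search. The sketch below is a simplification and not the patent's procedure: it picks a single codevector across all codebooks rather than combining one excitation per codebook, it uses a plain squared error without perceptual weighting, and it starts the synthesis filter from zero state. The function names and the coefficient convention A(z) = 1 + sum_i lpc[i] z^-(i+1) are assumptions.

```python
import numpy as np

def lpc_synthesis(excitation, lpc):
    """All-pole synthesis filter 1/A(z), zero initial state (assumed convention)."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for i, a in enumerate(lpc):
            if n - 1 - i >= 0:
                acc -= a * out[n - 1 - i]
        out[n] = acc
    return out

def search_codebooks(target, codebooks, lpc):
    """Return (error, codebook index, codevector index, gain) with the smallest squared error."""
    best = None
    for cb_idx, codebook in enumerate(codebooks):
        for cv_idx, codevector in enumerate(codebook):
            synth = lpc_synthesis(codevector, lpc)
            energy = synth @ synth
            if energy == 0.0:
                continue  # an all-zero codevector cannot be scaled to match anything
            gain = (target @ synth) / energy  # least-squares optimal gain
            error = np.sum((target - gain * synth) ** 2)
            if best is None or error < best[0]:
                best = (error, cb_idx, cv_idx, gain)
    return best
```

With one pulse-like codebook and one noise-like codebook passed in `codebooks`, the returned codebook index indicates which kind of excitation matched the target segment better under this error measure.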

    AUDIO ENCODING
    Document type: Granted patent
    Legal status: In force

    Publication No.: EP1676263B1

    Publication date: 2009-12-16

    Application No.: EP04770161.0

    Filing date: 2004-10-04

    IPC classification: G10L19/08 G10L19/02

    CPC classification: G10L19/032 G10L19/093

    Abstract: Coding of an audio signal (x) represented by a respective set of sampled signal values (x(t)) for each of a plurality of sequential time segments is disclosed. The sampled signal values are analyzed to determine one or more sinusoidal components for each of the plurality of sequential segments. The sinusoidal components are linked across a plurality of sequential segments to provide sinusoidal tracks, where each track comprises a number of frames. An encoded signal (AS) is generated, including sinusoidal codes (Cs) comprising a representation level (r) for each frame or including sinusoidal codes (Cs) where some of these codes comprise a phase (ϕ), a frequency (ω) and a quantization table (Q) for a given frame when the given frame is designated as a random-access frame. The invention allows random access in a track while avoiding long adaptation of the quantization accuracy in a quantizer and/or the need for a large bit stream while still maintaining improved audio quality.
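The linking of sinusoidal components into tracks mentioned in this abstract can be illustrated with a greedy nearest-frequency matcher. This is a common textbook approach and only an assumption here; the abstract does not say how the linking is done. The function name, the `max_jump` threshold and the use of frequency alone (ignoring amplitude and phase) are all hypothetical.

```python
def link_tracks(frames, max_jump):
    """Link per-segment frequencies into tracks.

    frames: list of lists, the component frequencies found in each segment.
    Returns a list of tracks, each a list of (segment index, frequency).
    """
    tracks = []   # every track ever started
    active = []   # tracks that were extended in the previous segment
    for seg, freqs in enumerate(frames):
        unused = list(freqs)
        next_active = []
        for track in active:
            if not unused:
                continue  # nothing left to link to: the track ends here
            last = track[-1][1]
            nearest = min(unused, key=lambda f: abs(f - last))
            if abs(nearest - last) <= max_jump:
                track.append((seg, nearest))
                unused.remove(nearest)
                next_active.append(track)
        for f in unused:  # unmatched components start new tracks
            track = [(seg, f)]
            tracks.append(track)
            next_active.append(track)
        active = next_active
    return tracks
```

Each entry of a returned track corresponds to one frame of that track. A random-access frame in the sense of the abstract would be a frame that carries its phase, frequency and quantization table explicitly, so that decoding can start there.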