Audio signal compression method, audio signal compression apparatus, speech signal compression method, speech signal compression apparatus, speech recognition method, and speech recognition apparatus
    11.
    Granted patent
    Status: In force

    Publication No.: US06477490B2

    Publication date: 2002-11-05

    Application No.: US09892745

    Application date: 2001-06-28

    IPC classification: G10L1906

    CPC classification: H04B1/665 G10L2019/0005

    Abstract: An audio signal compression apparatus for compressively coding an input audio signal comprises a time-to-frequency transformation unit for transforming the input audio signal to a frequency domain signal; a spectrum envelope calculation unit for calculating a spectrum envelope having different resolutions for different frequencies, from the input audio signal, using a weighting function on frequency based on human auditory characteristics; a normalization unit for normalizing the frequency domain signal using the spectrum envelope to obtain a residual signal; a power normalization unit for normalizing the residual signal by the power; an auditory weighting calculation unit for calculating weighting coefficients on frequency, based on the spectrum of the input audio signal and human auditory characteristics; and a multi-stage quantization device having plural stages of vector quantizers connected in series, to which the normalized residual signal is input, and at least one of the vector quantizers quantizing the residual signal using the weighting coefficients. Therefore, a low frequency band, which is auditively important, can be analyzed with a higher frequency resolution as compared with a high frequency band, whereby efficient signal compression utilizing human auditory characteristics is realized.

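
    The multi-stage quantization described in the abstract above can be sketched as follows. This is a minimal illustration under assumptions, not the patented implementation: the function names, the toy codebooks, and the use of the same weighting coefficients at every stage are hypothetical (the abstract only requires that at least one stage use them). Each stage picks the codeword with the smallest weighted squared error and hands the remaining error to the next stage.

```python
def weighted_distance(vector, codeword, weights):
    """Weighted squared error between a vector and a codeword."""
    return sum(w * (v - c) ** 2 for v, c, w in zip(vector, codeword, weights))


def nearest_index(vector, codebook, weights):
    """Index of the codeword with the smallest weighted squared error."""
    return min(
        range(len(codebook)),
        key=lambda i: weighted_distance(vector, codebook[i], weights),
    )


def multistage_vq(residual, codebooks, weights):
    """Quantize `residual` with vector quantizers connected in series.

    Assumption: every stage uses the same auditory weights.
    Returns one codeword index per stage and the final quantization error.
    """
    indices = []
    remaining = list(residual)
    for codebook in codebooks:
        i = nearest_index(remaining, codebook, weights)
        indices.append(i)
        # The next stage only sees what this stage failed to represent.
        remaining = [r - c for r, c in zip(remaining, codebook[i])]
    return indices, remaining


def reconstruct(indices, codebooks):
    """Decoder side: sum the selected codewords of all stages."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out
```

    Giving a larger weight to a component makes errors in that component more costly, so the search favors codewords that match the auditively important components first.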

    Speech recognition method and apparatus using frequency warping of linear prediction coefficients
    12.
    Granted patent
    Status: In force

    Publication No.: US06311153B1

    Publication date: 2001-10-30

    Application No.: US09165297

    Application date: 1998-10-02

    IPC classification: G01L2100

    CPC classification: H04B1/665 G10L2019/0005

    Abstract: An audio signal compression apparatus for compressively coding an input audio signal comprises a time-to-frequency transformation unit for transforming the input audio signal to a frequency domain signal; a spectrum envelope calculation unit for calculating a spectrum envelope having different resolutions for different frequencies, from the input audio signal, using a weighting function on frequency based on human auditory characteristics; a normalization unit for normalizing the frequency domain signal using the spectrum envelope to obtain a residual signal; a power normalization unit for normalizing the residual signal by the power; an auditory weighting calculation unit for calculating weighting coefficients on frequency, based on the spectrum of the input audio signal and human auditory characteristics; and a multi-stage quantization device having plural stages of vector quantizers connected in series, to which the normalized residual signal is input, and at least one of the vector quantizers quantizing the residual signal using the weighting coefficients. Therefore, a low frequency band, which is auditively important, can be analyzed with a higher frequency resolution as compared with a high frequency band, whereby efficient signal compression utilizing human auditory characteristics is realized.


    Energy shaping apparatus and energy shaping method
    13.
    Granted patent
    Status: In force

    Publication No.: US08019614B2

    Publication date: 2011-09-13

    Application No.: US12065378

    Application date: 2006-08-31

    IPC classification: G10L19/00

    Abstract: A temporal processing apparatus includes: a splitter splitting an audio signal, included in the sub-band domain, into diffuse signals indicating reverberating components and direct signals indicating non-reverberating components; a downmix unit generating a downmix signal by downmixing the direct signals; BPFs respectively generating a bandpass downmix signal and bandpass diffuse signals; normalization processing units respectively generating a normalized downmix signal and normalized diffuse signals; a scale computation processing unit computing, on a predetermined time slot basis, a scale factor indicating the magnitude of energy of the normalized downmix signal with respect to energy of the normalized diffuse signals; a calculating unit generating scale diffuse signals; a HPF generating high-pass diffuse signals; an adding unit generating addition signals; and a synthesis filter bank performing synthesis filter processing on the addition signals and transforming the addition signals into the time domains.


    ENERGY SHAPING APPARATUS AND ENERGY SHAPING METHOD
    14.
    Patent application
    Status: In force

    Publication No.: US20090234657A1

    Publication date: 2009-09-17

    Application No.: US12065378

    Application date: 2006-08-31

    IPC classification: G10L21/00

    Abstract: A temporal processing apparatus (energy shaping apparatus) (600a) includes: a splitter (601) splitting an audio signal, included in the sub-band domain, which is obtained through a hybrid time and frequency transformation, into diffuse signals indicating reverberating components and direct signals indicating non-reverberating components; a downmix unit (604) generating a downmix signal by downmixing the direct signals; BPFs (605 and 606) respectively generating a bandpass downmix signal and bandpass diffuse signals, by performing bandpass processing on the downmix signal and the diffuse signals on a sub-band-to-sub-band basis, which are split on the sub-band basis; normalization processing units (607 and 608) respectively generating a normalized downmix signal and normalized diffuse signals by normalizing the bandpass downmix signal and the bandpass diffuse signals with regard to respective energy; a scale computation processing unit (609) computing, on a predetermined time slot basis, a scale factor indicating the magnitude of energy of the normalized downmix signal with respect to energy of the normalized diffuse signals; a calculating unit (611) generating scale diffuse signals by multiplying the normalized diffuse signals by the scale factor; a HPF (612) generating high-pass diffuse signals by performing high-pass processing on the scale diffuse signals; an adding unit (613) generating addition signals by adding the high-pass diffuse signals and the direct signals; and a synthesis filter bank (614) performing synthesis filter processing on the addition signals and transforming the addition signals into the time domains.

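
    The per-time-slot scale computation and the multiplication of the diffuse signal by the scale factor can be sketched as follows. This is a simplified illustration under assumptions: the signals are real-valued lists for a single sub-band, the scale factor is taken as the square root of the energy ratio (the abstract only says it indicates the downmix energy relative to the diffuse energy), and the bandpass, high-pass, and synthesis filter bank steps are omitted. All function names are hypothetical.

```python
import math


def energy(samples):
    """Sum of squared samples (real-valued signals assumed)."""
    return sum(s * s for s in samples)


def scale_factor(downmix_slot, diffuse_slot, eps=1e-12):
    """Amplitude gain that moves the diffuse energy toward the downmix energy.

    Assumption: gain = sqrt(E_downmix / E_diffuse); `eps` avoids division by zero.
    """
    return math.sqrt(energy(downmix_slot) / (energy(diffuse_slot) + eps))


def shape_diffuse(downmix, diffuse, slot_len):
    """Scale the diffuse signal slot by slot so it follows the downmix envelope."""
    shaped = []
    for start in range(0, len(diffuse), slot_len):
        diffuse_slot = diffuse[start:start + slot_len]
        downmix_slot = downmix[start:start + slot_len]
        gain = scale_factor(downmix_slot, diffuse_slot)
        shaped.extend(gain * s for s in diffuse_slot)
    return shaped
```

    With a loud first slot and a silent second slot in the downmix, the shaped diffuse signal is amplified in the first slot and muted in the second, which is the temporal shaping effect the abstract describes.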

    Encoding device and decoding device
    15.
    Granted patent
    Status: In force

    Publication No.: US08108222B2

    Publication date: 2012-01-31

    Application No.: US12836900

    Application date: 2010-07-15

    IPC classification: G10L21/00

    Abstract: An encoding device (200) includes an MDCT unit (202) that transforms an input signal in a time domain into a frequency spectrum including a lower frequency spectrum, a BWE encoding unit (204) that generates extension data which specifies a higher frequency spectrum at a higher frequency than the lower frequency spectrum, and an encoded data stream generating unit (205) that encodes to output the lower frequency spectrum obtained by the MDCT unit (202) and the extension data obtained by the BWE encoding unit (204). The BWE encoding unit (204) generates as the extension data (i) a first parameter which specifies a lower subband which is to be copied as the higher frequency spectrum from among a plurality of the lower subbands which form the lower frequency spectrum obtained by the MDCT unit (202) and (ii) a second parameter which specifies a gain of the lower subband after being copied.

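
    The two extension parameters named in the abstract, a lower-subband index and a gain, can be illustrated with the sketch below. It is a hedged example, not the patented method: the selection rule (smallest squared error after energy matching) and the function names are assumptions, and the MDCT and bitstream packing are left out.

```python
import math


def energy(band):
    """Sum of squared spectral coefficients."""
    return sum(x * x for x in band)


def encode_extension(lower_subbands, high_band):
    """Return (index, gain): which lower subband to copy and how to scale it.

    Assumption: the gain matches the energy of the copied subband to the
    high band, and the subband with the smallest squared error is chosen.
    """
    best = None
    for index, low in enumerate(lower_subbands):
        e_low = energy(low)
        if e_low == 0.0:
            continue  # an all-zero subband cannot be scaled to match
        gain = math.sqrt(energy(high_band) / e_low)
        error = sum((h - gain * l) ** 2 for h, l in zip(high_band, low))
        if best is None or error < best[0]:
            best = (error, index, gain)
    if best is None:
        return 0, 0.0
    return best[1], best[2]


def decode_extension(lower_subbands, index, gain):
    """Decoder side: copy the selected lower subband and apply the gain."""
    return [gain * x for x in lower_subbands[index]]
```

    Only the index and the gain need to be transmitted for the high band, which is why this kind of extension data is much smaller than coding the high-band coefficients directly.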

    Encoding device and decoding device
    16.
    Granted patent
    Status: In force

    Publication No.: US07783496B2

    Publication date: 2010-08-24

    Application No.: US12370203

    Application date: 2009-02-12

    IPC classification: G10L21/00

    Abstract: An encoding device (200) includes an MDCT unit (202) that transforms an input signal in a time domain into a frequency spectrum including a lower frequency spectrum, a BWE encoding unit (204) that generates extension data which specifies a higher frequency spectrum at a higher frequency than the lower frequency spectrum, and an encoded data stream generating unit (205) that encodes to output the lower frequency spectrum obtained by the MDCT unit (202) and the extension data obtained by the BWE encoding unit (204). The BWE encoding unit (204) generates as the extension data (i) a first parameter which specifies a lower subband which is to be copied as the higher frequency spectrum from among a plurality of the lower subbands which form the lower frequency spectrum obtained by the MDCT unit (202) and (ii) a second parameter which specifies a gain of the lower subband after being copied.


    Encoding device and decoding device
    17.
    Granted patent
    Status: In force

    Publication No.: US07509254B2

    Publication date: 2009-03-24

    Application No.: US11508915

    Application date: 2006-08-24

    IPC classification: G10L19/00

    Abstract: An encoding device (200) includes an MDCT unit (202) that transforms an input signal in a time domain into a frequency spectrum including a lower frequency spectrum, a BWE encoding unit (204) that generates extension data which specifies a higher frequency spectrum at a higher frequency than the lower frequency spectrum, and an encoded data stream generating unit (205) that encodes to output the lower frequency spectrum obtained by the MDCT unit (202) and the extension data obtained by the BWE encoding unit (204). The BWE encoding unit (204) generates as the extension data (i) a first parameter which specifies a lower subband which is to be copied as the higher frequency spectrum from among a plurality of the lower subbands which form the lower frequency spectrum obtained by the MDCT unit (202) and (ii) a second parameter which specifies a gain of the lower subband after being copied.


    Encoding device and decoding device
    18.
    Patent application
    Status: In force

    Publication No.: US20060287853A1

    Publication date: 2006-12-21

    Application No.: US11508915

    Application date: 2006-08-24

    IPC classification: G10L19/00

    Abstract: An encoding device (200) includes an MDCT unit (202) that transforms an input signal in a time domain into a frequency spectrum including a lower frequency spectrum, a BWE encoding unit (204) that generates extension data which specifies a higher frequency spectrum at a higher frequency than the lower frequency spectrum, and an encoded data stream generating unit (205) that encodes to output the lower frequency spectrum obtained by the MDCT unit (202) and the extension data obtained by the BWE encoding unit (204). The BWE encoding unit (204) generates as the extension data (i) a first parameter which specifies a lower subband which is to be copied as the higher frequency spectrum from among a plurality of the lower subbands which form the lower frequency spectrum obtained by the MDCT unit (202) and (ii) a second parameter which specifies a gain of the lower subband after being copied.


    Encoding device decoding device
    19.
    Granted patent
    Status: In force

    Publication No.: US07283967B2

    Publication date: 2007-10-16

    Application No.: US10285609

    Application date: 2002-11-01

    IPC classification: G10L19/00

    CPC classification: G10L21/038 G10L19/0208

    Abstract: An encoding device (100) includes (i) a first encoding unit (132) that encodes spectral data in the lower frequency band represented by a plurality of parameters, out of the spectral data obtained by transforming an audio signal inputted for a fixed time length, (ii) a second quantizing unit (133) that generates sub information representing characteristics of the spectral data in the higher frequency by fewer parameters than those for the lower frequency band, out of the spectral data obtained by the transformation, (iii) a second encoding unit (134) that encodes the generated sub information, and (iv) a stream output unit (140) that outputs the data encoded by the first encoding unit (132) and the data encoded by the second encoding unit (134).

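
    The idea of describing the higher band with fewer parameters than the lower band can be shown with a small sketch. The abstract does not say what the sub information contains, so the choice here (one average magnitude per group of high-band coefficients) is purely an assumption for illustration, as are the function names, the uniform quantizer, and the step size.

```python
def quantize_low_band(spectrum, cutoff, step=0.5):
    """One quantized value per low-band coefficient (uniform quantizer assumed)."""
    return [round(x / step) for x in spectrum[:cutoff]]


def high_band_sub_information(spectrum, cutoff, group_size):
    """One average magnitude per group of high-band coefficients.

    Assumption: the average magnitude stands in for the 'characteristics'
    of the higher-band spectral data mentioned in the abstract.
    """
    high = spectrum[cutoff:]
    groups = [high[i:i + group_size] for i in range(0, len(high), group_size)]
    return [sum(abs(x) for x in group) / len(group) for group in groups]
```

    For an eight-coefficient spectrum split at index 4 with groups of two, the low band yields four parameters and the high band only two.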

    Band-division encoder utilizing a plurality of encoding units
    20.
    Granted patent
    Status: In force

    Publication No.: US07246065B2

    Publication date: 2007-07-17

    Application No.: US10353019

    Application date: 2003-01-29

    IPC classification: G10L19/00 G10L21/00

    CPC classification: H04N19/61 H04N19/63

    Abstract: An encoding device (200) is comprised of a band dividing unit (201) that divides an input signal (207) into a low frequency signal (208) representing a signal in the lower frequency band and a high frequency signal (209) representing a signal in the higher frequency band, a lower frequency band encoding unit (202) that encodes the low frequency signal (208) and generates a low frequency code (213), a similarity judging unit (203) that judges similarity between the high frequency signal (209) and the low frequency signal (208) and generates switching information (210), “n” higher frequency band encoding units (205) that encode the high frequency signal (209) through respective encoding methods and generate a high frequency code (212), a switching unit (204) that selects one of the higher frequency band encoding units (205) and has the selected higher frequency band encoding unit (205) perform encoding, and a code multiplexing unit (206) that multiplexes the low frequency code (213), the high frequency code (212) and the switching information (210), and generates an output code (214).

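
    The flow described in the abstract (band division, similarity judgment, switching between higher-band encoders, multiplexing) can be sketched as below. Everything concrete in it is an assumption made for illustration: the similarity measure (normalized correlation of magnitudes), the threshold, the two toy higher-band encoders, the rounding used as a stand-in for real coding, and the dictionary used as the multiplexed output.

```python
import math


def split_bands(spectrum, cutoff):
    """Band dividing unit: low band below `cutoff`, high band from `cutoff` on."""
    return spectrum[:cutoff], spectrum[cutoff:]


def similarity(low, high):
    """Normalized correlation of magnitudes, in [0, 1] (assumed measure)."""
    n = min(len(low), len(high))
    dot = sum(abs(low[i]) * abs(high[i]) for i in range(n))
    norm = math.sqrt(sum(x * x for x in low[:n]) * sum(x * x for x in high[:n]))
    return dot / norm if norm > 0.0 else 0.0


def encode_high_by_gain(low, high):
    """Cheap encoder: describe the high band as a scaled copy of the low band."""
    e_low = sum(x * x for x in low)
    return [math.sqrt(sum(x * x for x in high) / e_low)] if e_low > 0.0 else [0.0]


def encode_high_directly(low, high):
    """Costlier encoder: code every high-band coefficient (rounding as a stand-in)."""
    return [round(x, 2) for x in high]


HIGH_BAND_ENCODERS = [encode_high_by_gain, encode_high_directly]  # n = 2 here


def encode(spectrum, cutoff, threshold=0.9):
    """Pick a high-band encoder from the similarity and multiplex the result."""
    low, high = split_bands(spectrum, cutoff)
    low_code = [round(x, 2) for x in low]
    switch = 0 if similarity(low, high) >= threshold else 1
    high_code = HIGH_BAND_ENCODERS[switch](low, high)
    return {"low": low_code, "high": high_code, "switch": switch}
```

    When the high band looks like a scaled copy of the low band, a single gain is sent; otherwise the sketch falls back to coding the high band directly, and the switching information tells the decoder which case applies.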