Method and apparatus for speech encoding and decoding by sinusoidal analysis and waveform encoding with phase reproducibility
    1.
    Granted invention patent
    Method and apparatus for speech encoding and decoding by sinusoidal analysis and waveform encoding with phase reproducibility (lapsed)

    Publication No.: US07454330B1

    Publication date: 2008-11-18

    Application No.: US08736546

    Filing date: 1996-10-24

    IPC classification: G10L19/14

    Abstract: A speech encoding method and apparatus in which an input speech signal is divided into blocks or frames that serve as encoding units and is encoded unit by unit. Plosive and fricative consonants are reproduced faithfully, and the occurrence of extraneous sounds at transitions between voiced (V) and unvoiced (UV) portions is reduced, so that clear speech free of a “stuffed” feeling can be produced. The encoding apparatus includes a first encoding unit that finds the linear predictive coding (LPC) residuals of the input speech signal and performs harmonic coding on them, and a second encoding unit that encodes the input speech signal by waveform coding. The first and second encoding units encode the voiced (V) and unvoiced (UV) portions of the input signal, respectively. The second encoding unit uses code excited linear prediction (CELP) encoding, which employs vector quantization with a closed-loop search for the optimum vector by an analysis-by-synthesis method. A corresponding decoding method and apparatus are also provided.
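
The closed-loop (analysis-by-synthesis) search named in the abstract can be sketched as follows. This is a minimal illustration, not the patented encoder: the function names, the tiny codebook, and the plain all-pole synthesis filter are assumptions, and the perceptual weighting that practical CELP coders apply is omitted.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis 1/A(z), with A(z) = 1 + sum(lpc[k] * z^-(k+1))."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - k - 1 >= 0:
                acc -= a * out[n - k - 1]
        out.append(acc)
    return out


def closed_loop_search(target, codebook, lpc):
    """Try every code vector, synthesize it, and keep the best match.

    Returns (index, gain, squared_error) for the code vector whose
    synthesized output, scaled by its optimal gain, is closest to target.
    """
    best = None
    for index, code in enumerate(codebook):
        y = synthesize(code, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero output cannot match anything
        gain = sum(t * v for t, v in zip(target, y)) / energy
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (index, gain, err)
    return best


# Hypothetical toy data: three code vectors and a first-order LPC filter.
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, -1.0, 0.0, 0.0]]
lpc = [-0.5]
target = [2.0 * v for v in synthesize(codebook[1], lpc)]
index, gain, err = closed_loop_search(target, codebook, lpc)
```

The search is "closed loop" because each candidate is judged by the error of its synthesized output rather than by its distance to the excitation itself.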


Voiced/unvoiced decision using a plurality of sigmoid-transformed parameters for speech coding
    2.
    Granted invention patent
    Voiced/unvoiced decision using a plurality of sigmoid-transformed parameters for speech coding (lapsed)

    Publication No.: US06023671A

    Publication date: 2000-02-08

    Application No.: US833970

    Filing date: 1997-04-11

    CPC classification: G10L25/93

    Abstract: A method and apparatus for making a voiced/unvoiced (V/UV) decision, that is, for judging whether an input speech signal is voiced or unvoiced. The input parameters for the V/UV decision are judged comprehensively so that a simplified algorithm can make a high-precision decision. The parameters include the frame-averaged energy of the input speech signal lev, the normalized autocorrelation peak value r0r, the spectral similarity degree pos, the number of zero crossings nZero, and the pitch lag pch. Denoting any of these parameters by x, function calculation circuits convert each one using a sigmoid function g(x) = A/(1 + exp(-(x - b)/a)), where A, a, and b are constants that differ for each input parameter. A V/UV decision circuit makes the voiced/unvoiced decision from the parameters converted by this sigmoid function g(x).
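
As a rough illustration of the sigmoid conversion g(x) = A/(1 + exp(-(x - b)/a)), the sketch below maps a few of the named parameters onto a common 0 to 1 scale. The constants in `PARAMS`, the use of only three parameters, and the averaging rule in `is_voiced` are all assumptions made for the example; the abstract does not state how the converted values are combined.

```python
import math


def sigmoid(x, A, a, b):
    """g(x) = A / (1 + exp(-(x - b) / a)), as given in the abstract."""
    return A / (1.0 + math.exp(-(x - b) / a))


# Hypothetical (A, a, b) constants per parameter. A negative `a` flips the
# curve, so that many zero crossings count as evidence of unvoiced speech.
PARAMS = {
    "lev": (1.0, 100.0, 400.0),
    "r0r": (1.0, 0.1, 0.5),
    "nZero": (1.0, -20.0, 140.0),
}


def is_voiced(features, threshold=0.5):
    """Assumed combination rule: average the converted values, then threshold."""
    scores = [sigmoid(features[name], *PARAMS[name]) for name in PARAMS]
    return sum(scores) / len(scores) > threshold


voiced_frame = {"lev": 2000.0, "r0r": 0.9, "nZero": 30.0}
unvoiced_frame = {"lev": 50.0, "r0r": 0.1, "nZero": 250.0}
```

Because every converted value lies between 0 and A, parameters with very different units can be compared or combined directly.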


Method and apparatus for decoding and changing the pitch of an encoded speech signal
    3.
    Granted invention patent
    Method and apparatus for decoding and changing the pitch of an encoded speech signal (lapsed)

    Publication No.: US5873059A

    Publication date: 1999-02-16

    Application No.: US736989

    Filing date: 1996-10-25

    Abstract: A method and apparatus for reproducing speech signals at a controlled speed and for synthesizing speech include a dividing unit that divides the input speech into time segments and an encoding unit that discriminates whether each speech segment is voiced or unvoiced. Based on the result of the discrimination, the encoding unit finds encoded parameters by performing sinusoidal synthesis and encoding for voiced segments, and vector quantization with a closed-loop search for an optimum vector by an analysis-by-synthesis method for unvoiced segments. A period modification unit modifies the length of time associated with each signal segment and calculates a set of modified encoded parameters. In the speech synthesizing unit, encoded speech signal data is output from the encoding unit, and pitch data and amplitude data specifying the spectral envelope are sent via a data conversion unit to a waveform synthesis unit. The data conversion unit changes the number of amplitude data points of the spectral envelope without changing the shape of the envelope, so that the pitch of the signal can be varied without changing its phoneme. The waveform synthesis unit synthesizes the speech waveform from the converted spectral envelope data and pitch data.
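
One way to picture changing the number of amplitude data points while keeping the envelope shape is to resample the envelope at the harmonics of the new pitch. The sketch below does this by linear interpolation; the function name, the 4000 Hz bandwidth, and the interpolation method are assumptions, not details taken from the patent.

```python
def resample_envelope(amps, f0, new_f0, bandwidth=4000.0):
    """Resample a harmonic amplitude envelope for a new fundamental.

    amps[k] is the envelope amplitude at harmonic (k + 1) * f0. The result
    holds the same envelope sampled at the harmonics of new_f0, so the
    number of points changes while the envelope shape is preserved.
    Requires at least two input amplitudes.
    """
    count = int(bandwidth // new_f0)
    out = []
    for k in range(1, count + 1):
        pos = k * new_f0 / f0 - 1.0  # fractional index into amps
        pos = min(max(pos, 0.0), len(amps) - 1.0)
        i = min(int(pos), len(amps) - 2)
        frac = pos - i
        out.append(amps[i] * (1.0 - frac) + amps[i + 1] * frac)
    return out


# A toy envelope that rises linearly with frequency: amplitude = f / 100.
old_amps = [float(k) for k in range(1, 41)]  # harmonics of 100 Hz
new_amps = resample_envelope(old_amps, f0=100.0, new_f0=200.0)
```

Doubling the pitch halves the number of harmonics below the band edge, and each new point still lies on the original envelope curve.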


Apparatus and method for encoding/decoding a speech signal using adaptively changing codebook vectors
    4.
    Granted invention patent
    Apparatus and method for encoding/decoding a speech signal using adaptively changing codebook vectors (lapsed)

    Publication No.: US5828996A

    Publication date: 1998-10-27

    Application No.: US736988

    Filing date: 1996-10-25

    CPC classification: G10L19/04 G10L19/12

    Abstract: An encoding apparatus in which an input speech signal is divided into blocks and encoded in units of blocks. The encoding apparatus includes an encoding unit for performing CELP encoding. The encoding unit has a noise codebook memory containing codebook vectors generated by clipping Gaussian noise and codebook vectors obtained by learning, using the code vectors generated by clipping the Gaussian noise as initial values. The encoding apparatus enables optimum encoding for a variety of speech configurations.
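
A noise codebook built from clipped Gaussian noise might be generated along these lines. The abstract only says "clipping"; the center clipping used here (small samples set to zero, which makes the vectors sparse), the threshold, and the function name are assumptions, and the learned half of the codebook is not shown.

```python
import random


def clipped_gaussian_codebook(size, dim, threshold=1.0, seed=0):
    """Build `size` code vectors of length `dim` from center-clipped noise.

    Each sample is drawn from a unit Gaussian; samples whose magnitude is
    below `threshold` are replaced by zero (assumed clipping rule).
    """
    rng = random.Random(seed)
    book = []
    for _ in range(size):
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        book.append([v if abs(v) >= threshold else 0.0 for v in vec])
    return book


book = clipped_gaussian_codebook(size=8, dim=16, threshold=1.0, seed=1)
```

Vectors like these could then serve as the initial values for a training procedure that produces the learned code vectors.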


    Speech decoding method and apparatus
    6.
    Granted invention patent
    Speech decoding method and apparatus (lapsed)

    Publication No.: US5752222A

    Publication date: 1998-05-12

    Application No.: US736342

    Filing date: 1996-10-23

    CPC classification: G10L19/26

    Abstract: A speech decoding method and apparatus for decoding encoded speech signals and subsequently post-filtering the decoded signals. The filter coefficient of a spectral shaping filter in the post-filter, which is fed with the encoded and subsequently decoded speech signal, is updated every sub-frame period, while the gain of a gain adjustment circuit that corrects the gain changes caused by the spectral shaping is updated every frame period, which is eight times as long as the sub-frame period. The filter coefficient is thus switched smoothly and tracks the signal more quickly, while the level changes otherwise caused by frequent gain switching are suppressed. The result is improved characteristics of the post-filter used for spectral shaping of the decoded signal supplied from the signal decoder, and more effective post-filter processing.
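
The two update rates can be sketched as below: the shaping coefficient changes every sub-frame, while a single energy-matching gain is computed per frame of eight sub-frames. The first-order shaping filter, the energy-matching gain rule, and all names are assumptions chosen to keep the example short; the abstract does not specify them.

```python
import math


def post_filter(signal, subframe_len, coeffs, subframes_per_frame=8):
    """Shape with a per-sub-frame coefficient, then correct gain per frame."""
    frame_len = subframe_len * subframes_per_frame
    shaped, prev = [], 0.0
    for n, x in enumerate(signal):
        c = coeffs[n // subframe_len]  # coefficient switches every sub-frame
        shaped.append(x - c * prev)    # hypothetical first-order shaping
        prev = x
    out = []
    for start in range(0, len(signal), frame_len):
        x = signal[start:start + frame_len]
        y = shaped[start:start + frame_len]
        e_in = sum(v * v for v in x)
        e_out = sum(v * v for v in y)
        gain = math.sqrt(e_in / e_out) if e_out > 0.0 else 1.0  # one gain per frame
        out.extend(gain * v for v in y)
    return out


signal = [math.sin(0.3 * n) for n in range(32)]
coeffs = [0.1 * (i % 3) for i in range(16)]  # one value per 2-sample sub-frame
filtered = post_filter(signal, subframe_len=2, coeffs=coeffs)
```

Holding the gain fixed across a whole frame avoids the level fluctuations that a gain recomputed at every sub-frame could introduce.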


Pitch extraction method and device utilizing autocorrelation of a plurality of frequency bands
    7.
    Granted invention patent
    Pitch extraction method and device utilizing autocorrelation of a plurality of frequency bands (lapsed)

    Publication No.: US5930747A

    Publication date: 1999-07-27

    Application No.: US788194

    Filing date: 1997-01-24

    IPC classification: G10L11/04 H03H17/02 G10L9/08

    CPC classification: G10L25/90 G10L25/06 G10L25/18

    Abstract: A pitch extraction method and apparatus whereby the pitch of a speech signal having various characteristics can be extracted accurately. The frame-based input speech signal, band-limited by an HPF 12 and an LPF 16, is sent to autocorrelation computing units 13, 17, where autocorrelation data is found. The pitch lag is computed and normalized in the pitch intensity/pitch lag computing units 14, 18. The pitch reliability of the input speech signals, limited by the HPF 12 and the LPF 16, is computed in evaluation parameter calculation units. A selection unit 20 selects one of the parameters obtained from the input speech signal, limited by the HPF 12 and the LPF 16, using the pitch lag and the evaluation parameter.
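
The per-band autocorrelation step can be sketched as follows. The band-splitting filters are left out, and the rule in `select_band` (keep the candidate with the stronger normalized peak) is an assumption standing in for the evaluation parameter, whose definition the abstract does not give.

```python
import math


def autocorr(frame, lag):
    """Autocorrelation of one frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def pitch_lag(frame, min_lag, max_lag):
    """Return (lag, strength): the lag of the autocorrelation peak and
    the peak value normalized by the zero-lag energy."""
    r0 = autocorr(frame, 0)
    if r0 == 0.0:
        return None, 0.0
    best = max(range(min_lag, max_lag + 1), key=lambda lag: autocorr(frame, lag))
    return best, autocorr(frame, best) / r0


def select_band(candidates):
    """Assumed selection rule: keep the (lag, strength) pair with the
    highest strength among the bands."""
    return max(candidates, key=lambda c: c[1])


# A test frame with a known period of 40 samples.
frame = [math.sin(2.0 * math.pi * n / 40.0) for n in range(320)]
lag, strength = pitch_lag(frame, min_lag=20, max_lag=100)
chosen = select_band([(lag, strength), (57, 0.3)])
```

In the arrangement the abstract describes, `pitch_lag` would run once on the high-pass output and once on the low-pass output before the selection step.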


Voice encoding method and apparatus using modified discrete cosine transform
    8.
    Granted invention patent
    Voice encoding method and apparatus using modified discrete cosine transform (lapsed)

    Publication No.: US5819212A

    Publication date: 1998-10-06

    Application No.: US736507

    Filing date: 1996-10-24

    Abstract: A method and apparatus for encoding an input signal, such as a broad-range speech signal, in which a number of decoding operations with different bit rates are enabled for assuring a high encoding bit rate and for minimizing deterioration of the reproduced sound even at a low bit rate. The signal encoding method includes a band-splitting step for splitting an input signal into a number of bands and a step of encoding the signals of the bands in different manners depending on the signal characteristics of the bands. Specifically, a low-range side signal is extracted by a low-pass filter from an input signal entering a terminal and undergoes linear predictive coding (LPC) analysis in an LPC analysis quantization unit. After the LPC residuals are found as short-term prediction residuals by an LPC inverse filter, the pitch is found by a pitch analysis circuit. Then, pitch residuals are found by long-term prediction by a pitch inverse filter. The pitch residuals are transformed by a modified discrete cosine transform (MDCT) circuit and vector-quantized by a vector-quantization circuit. The resulting quantization indices are transmitted along with the pitch lag and the pitch gain. The line spectral pairs (LSPs) are also sent as parameters representing the LPC coefficients.
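
For readers unfamiliar with the MDCT stage, here is a direct O(N^2) sketch of the transform and its inverse. It uses no window and a 1/N inverse normalization, which is one textbook convention rather than anything stated in the patent; the function names are illustrative.

```python
import math


def mdct(block):
    """Forward MDCT: 2N input samples give N coefficients."""
    n_half = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_half)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients give 2N time-aliased samples."""
    n_half = len(coeffs)
    return [
        sum(c * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)) / n_half
        for n in range(2 * n_half)
    ]


# Two blocks of 8 samples overlapping by 4 (N = 4).
x = [0.3, -1.2, 2.5, 0.7, -0.4, 1.9, -2.2, 0.1, 1.1, -0.6, 0.8, 2.0]
first = imdct(mdct(x[0:8]))
second = imdct(mdct(x[4:12]))
# Overlap-add of the two halves cancels the time-domain aliasing.
middle = [first[4 + i] + second[i] for i in range(4)]
```

Each block of 2N samples yields only N coefficients, yet overlapping blocks by half and adding the inverse outputs restores the shared samples, which is why the transform suits block-based coding of residuals.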


Speech decoding method and apparatus to control the reproduction speed by changing the number of transform coefficients
    9.
    Granted invention patent
    Speech decoding method and apparatus to control the reproduction speed by changing the number of transform coefficients (lapsed)

    Publication No.: US5899966A

    Publication date: 1999-05-04

    Application No.: US736211

    Filing date: 1996-10-25

    Abstract: A signal decoding method and apparatus in which the speech signal reproducing speed is controlled without changing the phoneme or the pitch. The apparatus has a data number converter for converting the number of orthogonal transform coefficients entering a transmission signal input terminal from N to M, an inverse orthogonal transform unit for inverse orthogonal-transforming the M orthogonal transform coefficients obtained by the data number converter, and a linear predictive coding synthesis filter for performing predictive synthesis based on the short-term prediction residuals obtained by the inverse orthogonal transform unit. For an input signal, short-term prediction residuals are found and orthogonally transformed to form the orthogonal transform coefficients at a rate of N coefficients per transform unit. The frequency positions of the N transform coefficients may be rearranged to M values by M/N, or oversampling may be used to change N to M. A portable radio terminal embodying the invention is also described.


    Signal band expanding method and apparatus and signal synthesis method and apparatus
    10.
    Granted invention patent
    Signal band expanding method and apparatus and signal synthesis method and apparatus (lapsed)

    Publication No.: US06539355B1

    Publication date: 2003-03-25

    Application No.: US09417585

    Filing date: 1999-10-14

    IPC classification: G10L19/02

    CPC classification: G10L21/038

    Abstract: A bandwidth expanding method and apparatus in which the frequency characteristics of the high-frequency components of broad band signals can be adjusted to the liking of the user, overflow due to addition is prevented without the user perceiving power variations, the number of broad band formants is reduced, and emphasis is placed on the rough structure of the spectrum, so that the quality of the produced broad band speech signals can be improved. To this end, in a speech bandwidth expansion device, the frequency characteristics of the frequency components not less than 3400 Hz are adjusted by preset alterable parameter values, and those components are added to the original narrow band speech components. If overflow has occurred in a sample, the high-range gain of the sample is lowered to a level below the overflow level before proceeding to the addition. Also, a broad band autocorrelation γw is generated and inverse-transformed in an inverse parameter conversion unit to produce a broad band linear prediction coefficient αW, which is used to synthesize the broad-band speech in a linear predictive coding synthesis unit.
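
The per-sample overflow handling can be sketched as follows. The 16-bit sample range, the rule for choosing the reduced gain, and the function name are assumptions for illustration; the abstract only says that the high-range gain of an overflowing sample is lowered below the overflow level before the addition.

```python
INT16_MAX = 32767
INT16_MIN = -32768


def add_high_band(narrow, high, gain=1.0):
    """Add gain * high to narrow, lowering the gain only where needed.

    Assumes every narrow-band sample already fits in 16 bits.
    """
    out = []
    for n_s, h_s in zip(narrow, high):
        g = gain
        total = n_s + g * h_s
        if total > INT16_MAX or total < INT16_MIN:
            limit = INT16_MAX if total > 0 else INT16_MIN
            g = (limit - n_s) / h_s  # largest gain that still fits (assumed rule)
            total = n_s + g * h_s
        # Guard against rounding error pushing the result past the limit.
        out.append(max(INT16_MIN, min(INT16_MAX, int(total))))
    return out


mixed = add_high_band([30000, 100, -32000], [5000, 50, -4000])
```

Reducing the high-band gain for the affected sample keeps the narrow band component intact, unlike hard clipping of the sum.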
