Method of noise reduction for speech codecs
    1.
    Granted invention patent
    Method of noise reduction for speech codecs (status: in force)

    Publication number: US06453289B1

    Publication date: 2002-09-17

    Application number: US09361015

    Filing date: 1999-07-23

    CPC classification number: G10L25/78 G10L21/0208

    Abstract: An improved noise reduction algorithm is provided, as well as a voice activity detector, for use in a voice communication system. The voice activity detector allows for a reliable estimate of noise and enhancement of noise reduction. The noise reduction algorithm and voice activity detector can be implemented integrally in an encoder or applied independently in speech coding applications. The voice activity detector employs line spectral frequencies and enhanced input speech which has undergone noise reduction to generate a voice activity flag. The noise reduction algorithm employs a smooth gain function determined from a smoothed noise spectral estimate and smoothed input noisy speech spectra. The gain function is smoothed both across frequency and time in an adaptive manner based on the estimate of the signal-to-noise ratio. The gain function is used for spectral amplitude enhancement to obtain a reduced-noise speech signal. Smoothing employs critical frequency bands corresponding to the human auditory system. Swirl reduction is performed to improve overall human perception of decoded speech.
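The gain computation described in this abstract can be illustrated with a minimal sketch. The abstract does not give the gain rule or the smoothing constants, so everything specific here is an assumption: a Wiener-style gain, a three-band moving average standing in for critical-band smoothing, a time-smoothing factor of 0.9 / (1 + SNR), and a gain floor of 0.1. The function names are hypothetical.

```python
def smooth_across_frequency(values, width=1):
    """Moving average over neighbouring bands (assumed stand-in for critical-band smoothing)."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(0, i - width), min(n, i + width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def noise_reduction_gain(noisy_power, noise_power, prev_gain, floor=0.1):
    """Per-band gain from smoothed spectra, smoothed over time (SNR-adaptive) and frequency."""
    noisy = smooth_across_frequency(noisy_power)
    noise = smooth_across_frequency(noise_power)
    gains = []
    for y, n, g_prev in zip(noisy, noise, prev_gain):
        # Estimated signal-to-noise ratio for this band.
        snr = max(y - n, 0.0) / max(n, 1e-12)
        # Wiener-style gain with a floor (assumed rule, not taken from the patent).
        raw = max(snr / (1.0 + snr), floor)
        # Low SNR: lean on the previous gain; high SNR: follow the new gain quickly.
        alpha = 0.9 / (1.0 + snr)
        gains.append(alpha * g_prev + (1.0 - alpha) * raw)
    # Final smoothing of the gain across frequency (fixed width here, for simplicity).
    return smooth_across_frequency(gains)
```

Multiplying each band's spectral amplitude by the returned gain would give the enhanced spectrum. In this sketch only the time smoothing adapts to the SNR; the frequency smoothing uses a fixed width.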


    Low data rate speech encoder with mixed excitation
    3.
    Granted invention patent
    Low data rate speech encoder with mixed excitation (status: expired)

    Publication number: US5668925A

    Publication date: 1997-09-16

    Application number: US482322

    Filing date: 1995-06-01

    CPC classification number: G10L19/06 G10L2025/906

    Abstract: A speech signal has its characteristics extracted and encoded (16), is transmitted over a limited-data-rate path (18), and is decoded (20) and synthesized (22) at the receiving end. The characteristics include line spectral frequencies (LSF), pitch and jitter. The LSF are extracted by autoregression and split-vector quantized (SVQ) in a single frame and, in parallel, in blocks of two, three and four frames. The SVQ codes have equal length and are evaluated for distortion in conjunction with a threshold. The threshold is varied in such a manner as to tend to select for transmission those codewords which maintain a constant data rate into a transmit buffer. A single-bit jitter bit and an encoded pitch value are product coded with the selected LSF codeword, and all are transmitted over the data path (18) to the receiver. The receiver decodes the characteristics, and controls a pitch generator (1226) in response to the pitch value and a random pitch jitter in response to the jitter bit. Two sets of line spectrum filters receive random noise and the pitch signal, respectively. The filtered signals are modulated by multipliers (1222, 1230) controlled by the LSF codes, and the filtered signals are summed and applied to a final LSF-controlled filter.
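Split-vector quantization, as named in this abstract, can be sketched as follows. The abstract does not specify the split sizes or codebooks, so the function names and any codebook values used with them are hypothetical; the sketch only shows the general technique of quantizing each sub-vector against its own codebook.

```python
def nearest_index(subvector, codebook):
    """Return (index, squared error) of the codeword closest to subvector."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(subvector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist


def split_vq_encode(lsf, codebooks):
    """Quantize consecutive slices of lsf, one slice per codebook."""
    indices, distortion, start = [], 0.0, 0
    for codebook in codebooks:
        width = len(codebook[0])  # each codebook fixes its own sub-vector length
        index, dist = nearest_index(lsf[start:start + width], codebook)
        indices.append(index)
        distortion += dist
        start += width
    return indices, distortion


def split_vq_decode(indices, codebooks):
    """Rebuild the full vector by concatenating the selected codewords."""
    decoded = []
    for index, codebook in zip(indices, codebooks):
        decoded.extend(codebook[index])
    return decoded
```

The total distortion returned by `split_vq_encode` is the kind of quantity that could be compared against a threshold when choosing between single-frame and multi-frame codewords.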


    Speech mode based multi-stage vector quantizer
    5.
    Granted invention patent
    Speech mode based multi-stage vector quantizer (status: expired)

    Publication number: US5966688A

    Publication date: 1999-10-12

    Application number: US958143

    Filing date: 1997-10-28

    CPC classification number: G10L19/07 G10L25/93

    Abstract: A speech mode based multi-stage vector quantizer is disclosed which quantizes and encodes line spectral frequency (LSF) vectors that were obtained by transforming the short-term predictor filter coefficients in a speech codec that utilizes linear predictive techniques. The quantizer includes a mode classifier that classifies each speech frame of a speech signal as being associated with one of a voiced, spectrally stationary (Mode A) speech frame, a voiced, spectrally non-stationary (Mode B) speech frame and an unvoiced (Mode C) speech frame. A converter converts each speech frame of the speech signal into an LSF vector, and an LSF vector quantizer includes a 12-bit, two-stage, backward predictive vector encoder that encodes the Mode A speech frames and a 22-bit, four-stage, backward predictive vector encoder that encodes the Mode B and the Mode C speech frames.
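The mode-dependent, multi-stage structure can be sketched as below. Only the total bit counts and stage counts per mode come from the abstract; the per-stage bit split is not given and is left out. The backward prediction step is omitted for brevity, and `MODE_CONFIG` and `multistage_vq_encode` are hypothetical names.

```python
# Total bits and stage counts per speech mode, as stated in the abstract.
MODE_CONFIG = {
    "A": {"bits": 12, "stages": 2},  # voiced, spectrally stationary
    "B": {"bits": 22, "stages": 4},  # voiced, spectrally non-stationary
    "C": {"bits": 22, "stages": 4},  # unvoiced
}


def multistage_vq_encode(vector, stage_codebooks):
    """Each stage quantizes the residual left over by the previous stage."""
    residual = list(vector)
    indices = []
    for codebook in stage_codebooks:
        # Pick the codeword with the smallest squared error against the residual.
        index = min(
            range(len(codebook)),
            key=lambda i: sum((r - c) ** 2 for r, c in zip(residual, codebook[i])),
        )
        indices.append(index)
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return indices, residual
```

A Mode A frame would pass two stage codebooks to `multistage_vq_encode`, and a Mode B or Mode C frame would pass four; the final residual is the remaining quantization error.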


    Frequency domain interpolative speech codec system

    Publication number: US06418408B1

    Publication date: 2002-07-09

    Application number: US09542792

    Filing date: 2000-04-04

    Abstract: Encoding of prototype waveform components, applicable to GeoMobile and Telephony Earth Station (TES), provides improved voice quality, enabling a dual-channel mode of operation which permits more users to communicate over the same physical channel. A prototype waveform (PW) gain is vector quantized using a vector quantizer (VQ) that explicitly populates the codebook by representative steady state and transient vectors of PW gain for tracking the abrupt variations in speech levels during onsets and other non-stationary events, while maintaining the accuracy of the speech level during stationary conditions. The rapidly evolving waveform (REW) and slowly evolving waveform (SEW) component vectors are converted to magnitude-phase. The variable dimension SEW magnitude vector is quantized using a hierarchical approach, i.e., a fixed dimension SEW mean vector computed by a sub-band averaging of SEW magnitude spectrum, and only the REW magnitude is explicitly encoded. The REW magnitude vector sequence is normalized to unity RMS value, resulting in a REW magnitude shape vector and a REW gain vector. The normalized REW magnitude vectors are modeled by a multi-band sub-band model which converts the variable-dimension REW magnitude shape vectors into fixed-dimension (e.g., six-dimensional) REW sub-band vectors. The sub-band vectors are averaged over time, resulting in a single average REW sub-band vector for each frame. At the decoder, the full-dimension REW magnitude shape vector is obtained from the REW sub-band vector by a piecewise-constant construction. The REW phase vector is regenerated at the decoder based on the received REW gain vector and the voicing measure, which determines a weighted mixture of SEW component and a random noise that is passed through a high pass filter to generate the REW component. The high pass filter poles are adjusted based on the voicing measure to control the REW component characteristics. At the output of the filter, the magnitude of the REW component is scaled to match the received REW magnitude vector.
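The REW magnitude steps (normalization to unity RMS, sub-band averaging, and piecewise-constant reconstruction at the decoder) can be sketched as follows. The abstract does not give the band edges, so equal-width bands are assumed here, and the normalization is applied to one vector rather than to a vector sequence. All function names are hypothetical.

```python
import math


def normalize_to_unit_rms(magnitudes):
    """Split a magnitude vector into a gain (its RMS) and a unit-RMS shape vector."""
    rms = math.sqrt(sum(m * m for m in magnitudes) / len(magnitudes))
    if rms == 0.0:
        return 0.0, list(magnitudes)
    return rms, [m / rms for m in magnitudes]


def band_edges(dimension, band_count):
    """Equal-width band boundaries (assumption; the abstract does not specify them)."""
    if dimension < band_count:
        raise ValueError("need at least one bin per sub-band")
    return [b * dimension // band_count for b in range(band_count + 1)]


def to_subbands(shape, band_count):
    """Average a variable-dimension shape vector into band_count sub-band values."""
    edges = band_edges(len(shape), band_count)
    return [sum(shape[lo:hi]) / (hi - lo) for lo, hi in zip(edges, edges[1:])]


def from_subbands(subbands, dimension):
    """Piecewise-constant reconstruction: repeat each sub-band value across its band."""
    edges = band_edges(dimension, len(subbands))
    out = []
    for value, lo, hi in zip(subbands, edges, edges[1:]):
        out.extend([value] * (hi - lo))
    return out
```

With `band_count=6`, `to_subbands` yields a six-dimensional sub-band vector regardless of the input dimension, matching the example dimension mentioned in the abstract.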

    Constant data rate speech encoder for limited bandwidth path
    7.
    Granted invention patent
    Constant data rate speech encoder for limited bandwidth path (status: expired)

    Publication number: US5649051A

    Publication date: 1997-07-15

    Application number: US486130

    Filing date: 1995-06-01

    CPC classification number: G10L19/022 H03M7/3082 G10L19/002

    Abstract: A speech signal has its characteristics extracted and encoded (16), is transmitted over a limited-data-rate path (18), and is decoded (20) and synthesized (22) at the receiving end. The characteristics include line spectral frequencies (LSF), pitch and jitter. The LSF are extracted by autoregression and split-vector quantized (SVQ) in a single frame and, in parallel, in blocks of two, three and four frames. The SVQ codes have equal length and are evaluated for distortion in conjunction with a threshold. The threshold is varied in such a manner as to tend to select for transmission those codewords which maintain a constant data rate into a transmit buffer. A single-bit jitter bit and an encoded pitch value are product coded with the selected LSF codeword, and all are transmitted over the data path (18) to the receiver. The receiver decodes the characteristics, and controls a pitch generator (1226) in response to the pitch value and a random pitch jitter in response to the jitter bit. Two sets of line spectrum filters receive random noise and the pitch signal, respectively. The filtered signals are modulated by multipliers (1222, 1230) controlled by the LSF codes, and the filtered signals are summed and applied to a final LSF-controlled filter.
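The threshold-driven selection between one-, two-, three- and four-frame codewords can be sketched as follows. The abstract states only that the threshold is varied to keep a constant data rate into the transmit buffer; the selection policy (prefer the longest block whose distortion is under the threshold) and the 5% multiplicative update are assumptions, and both function names are hypothetical.

```python
def choose_block_length(distortions, threshold):
    """Pick the longest block (fewest bits per frame) whose distortion is acceptable.

    distortions maps block length in frames (1 to 4) to quantization distortion.
    Falls back to the single-frame codeword when nothing meets the threshold.
    """
    acceptable = [length for length, d in distortions.items() if d <= threshold]
    return max(acceptable) if acceptable else 1


def update_threshold(threshold, buffer_fill, target_fill, step=0.05):
    """Raise the threshold when the buffer runs full, lower it when it runs empty."""
    if buffer_fill > target_fill:
        # Too many bits queued: tolerate more distortion so longer blocks get chosen.
        return threshold * (1.0 + step)
    if buffer_fill < target_fill:
        # Spare capacity: demand less distortion so shorter blocks get chosen.
        return threshold * (1.0 - step)
    return threshold
```

Because the codewords have equal length, a codeword covering four frames costs a quarter of the bits per frame of a single-frame codeword, which is why moving the threshold changes the rate at which the buffer fills.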

