Patent search ap:"Srinivas Nandkumar" Page 1

1.

发明授权
Method of noise reduction for speech codecs 有权
Title translation: 语音编解码器降噪方法

公开(公告)号：US06453289B1

公开(公告)日：2002-09-17

申请号：US09361015

申请日：1999-07-23

Applicant: Filiz Basbug Ertem , Srinivas Nandkumar , Kumar Swaminathan

Inventor： Filiz Basbug Ertem , Srinivas Nandkumar , Kumar Swaminathan

IPC: G10L2102

CPC classification number: G10L25/78 , G10L21/0208

Abstract: An improved noise reduction algorithm is provided, as well as a voice activity detector, for use in a voice communication system. The voice activity detector allows for a reliable estimate of noise and enhancement of noise reduction. The noise reduction algorithm and voice activity detector can be implemented integrally in an encoder or applied independently to speech coding application. The voice activity detector employs line spectral frequencies and enhanced input speech which has undergone noise reduction to generate a voice activity flag. The noise reduction algorithm employs a smooth gain function determined from a smoothed noise spectral estimate and smoothed input noisy speech spectra. The gain function is smoothed both across frequency and time in an adaptive manner based on the estimate of the signal-to-noise ratio. The gain function is used for spectral amplitude enhancement to obtain a reduced noise speech signal. Smoothing employs critical frequency bands corresponding to the human auditory system. Swirl reduction is performed to improve overall human perception of decoded speech.

Abstract translation: 提供了一种改进的降噪算法，以及用于语音通信系统中的语音活动检测器。语音活动检测器允许噪声的可靠估计和噪声降低的增强。噪声降低算法和语音活动检测器可以在编码器中一体地实现或独立地应用于语音编码应用。语音活动检测器采用经过降噪的线谱频率和增强输入语音以产生语音活动标志。噪声降低算法采用从平滑噪声谱估计和平滑输入噪声语音谱确定的平滑增益函数。基于信噪比的估计，以自适应方式在频率和时间上平滑增益功能。增益函数用于频谱振幅增强，以获得降噪噪声语音信号。平滑采用对应于人类听觉系统的临界频带。进行旋转减少以改善对解码语音的整体人感知。

2.

发明授权
Voicing measure as an estimate of signal periodicity for a frequency domain interpolative speech codec system 有权
Title translation: 音频测量作为频域内插语音编解码系统的信号周期的估计

公开(公告)号：US06691092B1

公开(公告)日：2004-02-10

申请号：US09542390

申请日：2000-04-04

Applicant: Bangalore R. Udaya Bhaskar , Srinivas Nandkumar , Kumar Swaminathan , Gaguk Zakaria

Inventor： Bangalore R. Udaya Bhaskar , Srinivas Nandkumar , Kumar Swaminathan , Gaguk Zakaria

IPC: G10L1304

CPC classification number: G10L19/097 , G10L2025/783

Abstract: A system determines a voicing measure as a measure of the degree of signal periodicity and uses the determined voicing measure to quantize the spectral magnitude of the slowly evolving waveform (SEW) and the modeling of the SEW and rapidly evolving waveform (REW) phase spectra.

Abstract translation: 系统将发声测量值确定为信号周期程度的度量，并使用确定的发声测量来量化慢速演化波形（SEW）的频谱幅度以及SEW和快速演变波形（REW）相位谱的建模。

3.

发明授权
Low data rate speech encoder with mixed excitation 失效
Title translation: 具有混合激励的低数据速率语音编码器

公开(公告)号：US5668925A

公开(公告)日：1997-09-16

申请号：US482322

申请日：1995-06-01

Applicant: Joseph Harvey Rothweiler , John Charles Carmody , Srinivas Nandkumar

Inventor： Joseph Harvey Rothweiler , John Charles Carmody , Srinivas Nandkumar

IPC: G10L11/04 , G10L19/06 , G10L9/00

CPC classification number: G10L19/06 , G10L2025/906

Abstract: A speech signal has its characteristics extracted and encoded (16), transmitted over a limited-data-rate path (18) and is decoded (20) and synthesized (22) at the receiving end. The characteristics include line spectral frequencies (LSF), pitch and jitter. The LSF are extracted by autoregression, and splitvector quantized (SVQ) in a single frame, and, in parallel, in blocks of two, three and four frames. The SVQ codes have equal length and are evaluated for distortion in conjunction with a threshold. The threshold is varied in such a manner as tend to select for transmission those codewords which maintain a constant data rate into a transmit buffer. A single-bit jitter bit, and encoded pitch value, are product coded with the selected LSF codeword, and all are transmitted over the data path (18) to the receiver. The receiver decodes the characteristics, and controls a pitch generated (1226) in response to the pitch value and a random pitch jitter in response to the jitter bit. Two sets of line spectrum filters receive random noise and the pitch signal, respectively. The filtered signals are modulated by multipliers (1222, 1230) controlled by the LSF codes, and the filtered signals are summed and applied to a final LSF-controlled filter.

Abstract translation: 语音信号具有提取和编码（16）的特征，在有限数据速率路径（18）上传输，并在接收端解码（20）并合成（22）。特征包括线频谱（LSF），音调和抖动。 LSF通过自回归和分离向量量化（SVQ）在单个帧中并行并行地以两个，三个和四个帧的块来提取。 SVQ代码具有相等的长度，并且与阈值一起评估失真。阈值以这样的方式变化，倾向于选择将将恒定数据速率保持在发送缓冲器中的那些码字。单位抖动位和编码音调值用所选择的LSF码字进行乘积编码，并且全部通过数据路径（18）发送到接收机。接收机解码特性，并且响应于音调值和响应于抖动位的随机音调抖动来控制产生的音调（1226）。两组线谱滤波器分别接收随机噪声和音调信号。经滤波的信号由被LSF码控制的乘法器（1222,1230）调制，滤波后的信号相加并施加到最终的LSF控制的滤波器。

4.

发明授权
Spectral magnitude modeling and quantization in a frequency domain interpolative speech codec system 有权
Title translation: 频域内插语音编解码系统中的频谱幅度建模和量化

公开(公告)号：US06493664B1

公开(公告)日：2002-12-10

申请号：US09542793

申请日：2000-04-04

Applicant: Bangalore R. Udaya Bhaskar , Srinivas Nandkumar , Kumar Swaminathan , Gaguk Zakaria

Inventor： Bangalore R. Udaya Bhaskar , Srinivas Nandkumar , Kumar Swaminathan , Gaguk Zakaria

IPC: G10L1912

CPC classification number: G10L19/18 , G10L19/005 , G10L19/02 , G10L19/0204 , G10L19/04 , G10L19/083 , G10L19/09 , G10L25/27 , G10L25/30 , G10L25/78 , G10L25/90 , G10L2019/0012 , G10L2025/783

Abstract: Encoding of prototype waveform components applicable to telecommunication systems provides improved voice quality enabling a dual-channel mode of operation which permits more users to communicate over the same physical channel. A prototype word (PW) gain is vector quantized using a vector quantizer (VQ) that explicitly populates a codebook by representative steady state and transient vectors of PW gain for tracking the abrupt variations in speech levels during onsets and other non-stationary events, while maintaining the accuracy of the speech level during stationary conditions.

Abstract translation: 适用于电信系统的原型波形组件的编码提供改进的语音质量，从而实现双通道操作模式，允许更多的用户通过相同的物理信道进行通信。原型字（PW）增益使用矢量量化器（VQ）进行矢量量化，矢量量化器（VQ）通过代表性的稳定状态和PW增益的瞬态矢量明确地填充码本，用于跟踪在开始和其他非平稳事件期间语音电平的突然变化，而在静止状态下保持语音水平的准确性。

5.

发明授权
Speech mode based multi-stage vector quantizer 失效
Title translation: 基于语音模式的多级矢量量化器

公开(公告)号：US5966688A

公开(公告)日：1999-10-12

申请号：US958143

申请日：1997-10-28

Applicant: Srinivas Nandkumar , Kumar Swaminathan

Inventor： Srinivas Nandkumar , Kumar Swaminathan

IPC: G10L11/06 , G10L19/00 , G10L19/06 , G10L3/02

CPC classification number: G10L19/07 , G10L25/93

Abstract: A speech mode based multi-stage vector quantizer is disclosed which quantizes and encodes line spectral frequency (LSF) vectors that were obtained by transforming the short-term predictor filter coefficients in a speech codec that utilizes linear predictive techniques. The quantizer includes a mode classifier that classifies each speech frame of a speech signal as being associated with one of a voiced, spectrally stationary (Mode A) speech frame, a voiced, spectrally non-stationary (Mode B) speech frame and an unvoiced (Mode C) speech frame. A converter converts each speech frame of the speech signal into an LSF vector and an LSF vector quantizer includes a 12-bit, two-stage, backward predictive vector encoder that encodes the Mode A speech frames and a 22 bit, four-stage backward predictive vector encoder that encodes the Mode 13 and the Mode C speech frames.

Abstract translation: 公开了一种基于语音模式的多级矢量量化器，其对通过使用线性预测技术的语音编解码器中的短期预测器滤波器系数进行变换而获得的线谱频率（LSF）矢量进行量化和编码。量化器包括模式分类器，其将语音信号的每个语音帧分类为与有声，频谱平稳（模式A）语音帧，有声，频谱非平稳（模式B）语音帧和无声（模式C）语音帧。 A转换器将语音信号的每个语音帧转换为LSF向量，并且LSF向量量化器包括对模A语音帧进行编码的12位两级反向预测向量编码器和22位四级后向预测编码模式13和模式C语音帧的矢量编码器。

6.

发明授权
Frequency domain interpolative speech codec system 有权

公开(公告)号：US06418408B1

公开(公告)日：2002-07-09

申请号：US09542792

申请日：2000-04-04

Applicant: Bangalore R. Udaya Bhaskar , Srinivas Nandkumar , Kumar Swaminathan , Gaguk Zakaria

Inventor： Bangalore R. Udaya Bhaskar , Srinivas Nandkumar , Kumar Swaminathan , Gaguk Zakaria

IPC: G10L1904

CPC classification number: G10L19/18 , G10L19/005 , G10L19/02 , G10L19/0204 , G10L19/04 , G10L19/083 , G10L19/09 , G10L25/27 , G10L25/30 , G10L25/78 , G10L25/90 , G10L2019/0012 , G10L2025/783

Abstract: Encoding of prototype waveform components applicable to GeoMobile and Telephony Earth Station (TES) providing improved voice quality enabling a dual-channel mode of operation which permits more users to communicate over the same physical channel. A prototype word (PW) gain is vector quantized using a vector quantizer (VQ) that explicitly populates the codebook by representative steady state and transient vectors of PW gain for tracking the abrupt variations in speech levels during onsets and other non-stationary events, while maintaining the accuracy of the speech level during stationary conditions. The rapidly evolving waveform (REW) and slowly evolving waveform (SEW) component vectors are converted to magnitude-phase. The variable dimension SEW magnitude vector is quantized using a hierarchical approach, i.e., a fixed dimension SEW mean vector computed by a sub-band averaging of SEW magnitude spectrum, and only the REW magnitude is explicitly encoded. The REW magnitude vector sequence is normalized to unity RMS value, resulting in a REW magnitude shape vector and a REW gain vector. The normalized REW magnitude vectors are modeled by a multi-band sub-band model which converts the variable dimension REW magnitude shape vectors, e.g., six dimensional REW sub-band vectors. The sub-band vectors are averaged over time, resulting in a single average REW sub-band vector for each frame. At the decoder, the full-dimension REW magnitude shape vector is obtained from the REW sub-band vector by a piecewise-constant construction. The REW phase vector is regenerated at the decoder based on the received REW gain vector and the voicing measure, which determines a weighted mixture of SEW component and a random noise that is passed through a high pass filter to generate the REW component. The high pass filter poles are adjusted based on the voicing measure to control the REW component characteristics. At the output the filter, the magnitude of the REW component is scaled to match the received REW magnitude vector.

7.

发明授权
Constant data rate speech encoder for limited bandwidth path 失效
Title translation: 用于有限带宽路径的恒定数据速率语音编码器

公开(公告)号：US5649051A

公开(公告)日：1997-07-15

申请号：US486130

申请日：1995-06-01

Applicant: Joseph Harvey Rothweiler , John Charles Carmody , Srinivas Nandkumar

Inventor： Joseph Harvey Rothweiler , John Charles Carmody , Srinivas Nandkumar

IPC: G10L19/00 , G10L19/14 , H03M7/30 , G10L9/00

CPC classification number: G10L19/022 , H03M7/3082 , G10L19/002

Abstract: A speech signal has its characteristics extracted and encoded (16), transmitted over a limited-data-rate path (18) and is decoded (20) and synthesized (22) at the receiving end. The characteristics include line spectral frequencies (LSF), pitch and jitter. The LSF are extracted by autoregression, and split-vector quantized (SVQ) in a single frame, and, in parallel, in blocks of two, three and four frames. The SVQ codes have equal length and are evaluated for distortion in conjunction with a threshold. The threshold is varied in such a manner as tend to select for transmission those codewords which maintain a constant data rate into a transmit buffer. A single-bit jitter bit, and encoded pitch value, are product coded with the selected LSF codeword, and all are transmitted over the data path (18) to the receiver. The receiver decodes the characteristics, and controls a pitch generated (1226) in response to the pitch value and a random pitch jitter in response to the jitter bit. Two sets of line spectrum filters receive random noise and the pitch signal, respectively. The filtered signals are modulated by multipliers (1222, 1230) controlled by the LSF codes, and the filtered signals are summed and applied to a final LSF-controlled filter.

Abstract translation: 语音信号具有提取和编码（16）的特征，在有限数据速率路径（18）上传输，并在接收端解码（20）并合成（22）。特征包括线频谱（LSF），音调和抖动。 LSF通过自回归和单向帧中的矢量量化（SVQ）提取，并行并行地以两个，三个和四个帧的块来提取。 SVQ代码具有相等的长度，并且与阈值一起评估失真。阈值以这样的方式变化，倾向于选择将将恒定数据速率保持在发送缓冲器中的那些码字。单位抖动位和编码音调值用所选择的LSF码字进行乘积编码，并且全部通过数据路径（18）发送到接收机。接收机解码特性，并且响应于音调值和响应于抖动位的随机音调抖动来控制产生的音调（1226）。两组线谱滤波器分别接收随机噪声和音调信号。经滤波的信号由被LSF码控制的乘法器（1222,1230）调制，滤波后的信号相加并施加到最终的LSF控制的滤波器。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification