Patent search ap:"Kumar Swaminathan" Page 2

11.

Granted invention patent
Voicing measure for a speech CODEC system
Status: In force

Publication number: US07013269B1

Publication date: 2006-03-14

Application number: US10073406

Application date: 2002-02-13

Applicant: Udaya Bhaskar, Kumar Swaminathan

Inventor: Udaya Bhaskar, Kumar Swaminathan

IPC: G10L19/02

CPC classification number: G10L19/097, G10L25/93

Abstract: A system and method is provided that employs a frequency domain interpolative CODEC system for low bit rate coding of speech which comprises a linear prediction (LP) front end adapted to process an input signal providing LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal. An open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator also provides a pitch contour within the predetermined intervals. A voice activity detector adapted to process the LP parameters and the open loop pitch contour over the predetermined intervals is also provided as well as a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following functions: extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and provide a voicing measure where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals. The voicing measure is provided for the purpose of regenerating a PW phase at a decoder; and providing improved quantization of the PW magnitude at an encoder. The voicing measure is encoded jointly with a PW nonstationarity measure vector using a spectrally weighted vector quantizer having a codebook partitioned based on a voiced and unvoiced mode.

12.

Invention patent application
Stitching of video for continuous presence multipoint video conferencing
Status: Under examination (published)

Publication number: US20050008240A1

Publication date: 2005-01-13

Application number: US10836672

Application date: 2004-04-30

Applicant: Ashish Banerji, Kannan Panchapakesan, Kumar Swaminathan

Inventor: Ashish Banerji, Kannan Panchapakesan, Kumar Swaminathan

IPC: G06K9/46, H04N5/235, H04N5/262, H04N13/02, H04N19/89, H04N19/895

CPC classification number: H04N7/15, H04N5/2624, H04N19/40, H04N19/46, H04N19/467, H04N19/573, H04N19/65, H04N19/70, H04N19/89, H04N19/895

Abstract: A drift-free hybrid method of performing video stitching is provided. The method includes decoding a plurality of video bitstreams and storing prediction information. The decoded bitstreams form video images, spatially composed into a combined image. The image comprises frames of ideal stitched video sequence. The method uses prediction information in conjunction with previously generated frames to predict pixel blocks in the next frame. A stitched predicted block in the next frame is subtracted from a corresponding block in a corresponding frame to create a stitched raw residual block. The raw residual block is forward transformed, quantized, entropy encoded and added to the stitched video bitstream along with the prediction information. Also, the stitched raw residual block is inverse transformed and dequantized to create a stitched decoded residual block. The residual block is added to the predicted block to generate the stitched reconstructed block in the next frame of the sequence.
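
The encode-and-reconstruct loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the application: the forward and inverse transforms are omitted (treated as identity), the quantizer is a uniform scalar quantizer with a hypothetical step size `STEP`, entropy coding is left out, and every function name is invented for the example.

```python
# Hypothetical sketch of closed-loop residual coding for one pixel block.
# Assumptions (not from the abstract): identity transform, uniform scalar
# quantizer with step STEP, blocks represented as flat lists of integers.

STEP = 4  # assumed quantizer step size


def quantize(residual):
    """Map each residual sample to an integer quantization level."""
    return [round(r / STEP) for r in residual]


def dequantize(levels):
    """Map quantization levels back to approximate residual values."""
    return [q * STEP for q in levels]


def encode_block(ideal_block, predicted_block):
    """Return (levels, reconstructed_block) for one stitched block."""
    # Raw residual: ideal stitched block minus the stitched prediction.
    raw_residual = [i - p for i, p in zip(ideal_block, predicted_block)]
    levels = quantize(raw_residual)
    # The encoder rebuilds the block from the *decoded* residual, so its
    # reference for later frames matches what a decoder will hold.
    decoded_residual = dequantize(levels)
    reconstructed = [p + d for p, d in zip(predicted_block, decoded_residual)]
    return levels, reconstructed


def decode_block(levels, predicted_block):
    """Decoder side: prediction plus decoded residual."""
    return [p + d for p, d in zip(predicted_block, dequantize(levels))]
```

Because both sides add the same decoded residual to the same prediction, the encoder's reconstructed block and the decoder's output are identical, which is the property that keeps prediction drift from accumulating across frames.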

13.

Granted invention patent
Frequency domain interpolative speech codec system
Status: In force

Publication number: US06418408B1

Publication date: 2002-07-09

Application number: US09542792

Application date: 2000-04-04

Applicant: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria

Inventor: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria

IPC: G10L19/04

CPC classification number: G10L19/18, G10L19/005, G10L19/02, G10L19/0204, G10L19/04, G10L19/083, G10L19/09, G10L25/27, G10L25/30, G10L25/78, G10L25/90, G10L2019/0012, G10L2025/783

Abstract: Encoding of prototype waveform components applicable to GeoMobile and Telephony Earth Station (TES) provides improved voice quality, enabling a dual-channel mode of operation which permits more users to communicate over the same physical channel. A prototype waveform (PW) gain is vector quantized using a vector quantizer (VQ) that explicitly populates the codebook by representative steady state and transient vectors of PW gain for tracking the abrupt variations in speech levels during onsets and other non-stationary events, while maintaining the accuracy of the speech level during stationary conditions. The rapidly evolving waveform (REW) and slowly evolving waveform (SEW) component vectors are converted to magnitude-phase. The variable dimension SEW magnitude vector is quantized using a hierarchical approach, i.e., a fixed dimension SEW mean vector computed by a sub-band averaging of SEW magnitude spectrum, and only the REW magnitude is explicitly encoded. The REW magnitude vector sequence is normalized to unity RMS value, resulting in a REW magnitude shape vector and a REW gain vector. The normalized REW magnitude vectors are modeled by a multi-band sub-band model which converts the variable dimension REW magnitude shape vectors to fixed dimension, e.g., six dimensional, REW sub-band vectors. The sub-band vectors are averaged over time, resulting in a single average REW sub-band vector for each frame. At the decoder, the full-dimension REW magnitude shape vector is obtained from the REW sub-band vector by a piecewise-constant construction. The REW phase vector is regenerated at the decoder based on the received REW gain vector and the voicing measure, which determines a weighted mixture of SEW component and a random noise that is passed through a high pass filter to generate the REW component. The high pass filter poles are adjusted based on the voicing measure to control the REW component characteristics. At the output of the filter, the magnitude of the REW component is scaled to match the received REW magnitude vector.
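
The gain/shape split, sub-band averaging, and piecewise-constant reconstruction mentioned in this abstract might look roughly like the sketch below. It is a simplified illustration, not the patented scheme: the band edges are assumed to be uniform (the abstract does not give them), the vector dimension is assumed to be at least the number of bands, and all function names are hypothetical.

```python
import math


def normalize_to_unit_rms(magnitudes):
    """Split a magnitude vector into (gain, shape) with RMS(shape) == 1."""
    gain = math.sqrt(sum(m * m for m in magnitudes) / len(magnitudes))
    if gain == 0.0:
        return 0.0, [0.0] * len(magnitudes)
    return gain, [m / gain for m in magnitudes]


def band_edges(dim, num_bands):
    """Assumed uniform split: band b covers indices [edges[b], edges[b+1])."""
    return [(b * dim) // num_bands for b in range(num_bands + 1)]


def to_subbands(shape, num_bands):
    """Average a variable-dimension shape vector within each band.

    Assumes len(shape) >= num_bands so that no band is empty.
    """
    edges = band_edges(len(shape), num_bands)
    return [sum(shape[lo:hi]) / (hi - lo) for lo, hi in zip(edges, edges[1:])]


def from_subbands(subbands, dim):
    """Piecewise-constant reconstruction of the full-dimension vector."""
    edges = band_edges(dim, len(subbands))
    full = []
    for value, lo, hi in zip(subbands, edges, edges[1:]):
        full.extend([value] * (hi - lo))
    return full
```

The point of the fixed-dimension sub-band vector is that its length no longer depends on the pitch period, so a single fixed codebook can quantize it even though the underlying magnitude vector changes dimension from frame to frame.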

14.

Granted invention patent
Method of and apparatus for generating auxiliary information for expediting sparse codebook search
Status: Lapsed

Publication number: US5195137A

Publication date: 1993-03-16

Application number: US646122

Application date: 1991-01-28

Applicant: Kumar Swaminathan

Inventor: Kumar Swaminathan

IPC: G06T9/00, H03M7/30

CPC classification number: G06T9/008, H03M7/3082

Abstract: In many applications involving the coding and processing of speech signals the relevant applicable codebook is one which may be termed a sparse codebook. That is, the majority of elements in the codebook are zero valued. The searching of such a sparse codebook is accelerated in accord with the present invention by generating auxiliary information defining the sparse nature of the codebook and using this information to assist and speed up searches of the codebook. In a particular method of searching, the calculation of the distance between a target vector and a stored codebook vector is enhanced by use of a distortion metric derived from energy terms and correlation terms of the codebook entries. Calculation of these energy and correlation terms is speeded up by exploiting the sparseness of the codebook entries. The non-zero elements (NZE) of the sparse codebook are each identified and are defined by their offset from a reference point.
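
As a rough illustration of the idea in this abstract, the sketch below stores each codebook vector as a list of (offset, value) pairs for its non-zero elements and evaluates the distance through energy and correlation terms only. It is a minimal sketch under assumptions: plain squared error with no gain term and no synthesis filtering, offsets measured from index 0, and hypothetical function names. The patent's actual metric and data layout may differ.

```python
def build_sparse_entry(vector):
    """Auxiliary info for one entry: non-zero (offset, value) pairs and energy."""
    nze = [(i, v) for i, v in enumerate(vector) if v != 0]
    energy = sum(v * v for _, v in nze)  # precomputed once, offline
    return nze, energy


def search(target, sparse_codebook):
    """Return the index minimizing ||t - c||^2 = ||t||^2 - 2<t, c> + ||c||^2."""
    best_index, best_score = -1, float("inf")
    for index, (nze, energy) in enumerate(sparse_codebook):
        # Only the non-zero elements contribute to the correlation.
        correlation = sum(target[offset] * value for offset, value in nze)
        # ||t||^2 is the same for every entry, so it is dropped.
        score = energy - 2.0 * correlation
        if score < best_score:
            best_index, best_score = index, score
    return best_index
```

For a codebook in which most elements are zero, the inner correlation loop touches only a handful of samples per entry instead of the full vector length, which is where the speed-up comes from.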

15.

Granted invention patent
Device and method for communicating in a mobile satellite system
Status: Lapsed

Publication number: US5781540A

Publication date: 1998-07-14

Application number: US497582

Application date: 1995-06-30

Applicant: James Eryx Malcolm, Daniel Fraley, Adrian Morris, David Roos, Kumar Swaminathan, Seok Ho Kim, Robert Carroll Marquart

Inventor: James Eryx Malcolm, Daniel Fraley, Adrian Morris, David Roos, Kumar Swaminathan, Seok Ho Kim, Robert Carroll Marquart

IPC: H04B7/185, H04J3/06, H04L7/04, H04L7/10, H04B7/212

CPC classification number: H04J3/0605, H04B7/18532, H04L7/046, H04L7/10

Abstract: The present invention relates to a device and a method for communicating in a mobile communication system. The method provides a carrier signal having a plurality of frames. Each frame has a plurality of time slots, and each time slot comprises a plurality of transmission bits. A group of time slots is assigned to a communication channel. A traffic burst signal having a plurality of traffic symbols is transmitted over the communication channel by transmitting a first preamble over one of the assigned time slots, and transmitting a second preamble and at least one of the traffic symbols over at least one of the other assigned time slots. The second preamble occupies fewer transmission bits than the first preamble. The apparatus for transmitting a telephony signal over an RF channel includes a modem receiving a digitized PCM telephony signal and producing a traffic burst signal, and a transmitting unit in communication with the modem for transmitting an FDMA/TDMA signal carrying a plurality of traffic burst signals. At least one of the traffic burst signals carries a limited preamble message including a header field and a unique word field and at least one digitized voice message associated with a telephone call. Another traffic burst signal carries at least one signal acquisition message including a unique word field.

16.

Granted invention patent
Low rate multi-mode CELP codec that encodes line spectral frequencies utilizing an offset
Status: Lapsed

Publication number: US5751903A

Publication date: 1998-05-12

Application number: US359116

Application date: 1994-12-19

Applicant: Kumar Swaminathan, Murthy Vemuganti

Inventor: Kumar Swaminathan, Murthy Vemuganti

IPC: G10L11/06, G10L19/00, G10L19/14, G10L9/18

CPC classification number: G10L19/06, G10L19/18, G10L25/24, G10L25/93

Abstract: The present invention provides a multi-mode CELP encoding and decoding method and device for digitized speech signals providing improvements over prior art codecs and coding methods by selectively utilizing backward prediction for the short-term predictor parameters and fixed codebook gain of a speech signal. In order to achieve these improvements, the present invention provides a coding method comprising the steps of classifying a segment of the digitized speech signal as one of a plurality of predetermined modes, determining a set of unquantized line spectral frequencies to represent the short term predictor parameters for that segment, and quantizing the determined set of unquantized line spectral frequencies using a mode-specific combination of scalar quantization and vector quantization, which utilizes backward prediction for modes with voiced speech signals. Furthermore, backward prediction is selectively applied to the fixed codebook gain in the modes that are free of transients so that it may be used in the fixed codebook search and fixed codebook gain quantization in those modes.

17.

Granted invention patent
In-band transmission of TTY/TDD signals for systems employing low bit-rate voice compression
Status: In force

Publication number: US06961320B1

Publication date: 2005-11-01

Application number: US09669283

Application date: 2000-09-26

Applicant: Kumar Swaminathan, Udaya Bhaskar

Inventor: Kumar Swaminathan, Udaya Bhaskar

IPC: H04L5/22, H04M3/42, H04M11/00, H04M11/06

CPC classification number: H04M3/42391, H04M11/066, H04M2207/18

Abstract: A method, system, and software product for transmitting TTY/TDD signals in a system employing low bit-rate voice compression are disclosed. The method includes receiving an input signal and generating a teletypewriter (TTY) indicator signal from the input signal. Whether or not the input signal is a TTY signal including a TTY character is determined based on the TTY indicator signal. A TTY packet including the TTY character of the TTY signal is constructed and transmitted if the input signal is determined to be a TTY signal. A method, system, and software product for receiving and decoding TTY/TDD signals are also disclosed.

18.

Granted invention patent
Spectral magnitude modeling and quantization in a frequency domain interpolative speech codec system
Status: In force

Publication number: US06493664B1

Publication date: 2002-12-10

Application number: US09542793

Application date: 2000-04-04

Applicant: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria

Inventor: Bangalore R. Udaya Bhaskar, Srinivas Nandkumar, Kumar Swaminathan, Gaguk Zakaria

IPC: G10L19/12

CPC classification number: G10L19/18, G10L19/005, G10L19/02, G10L19/0204, G10L19/04, G10L19/083, G10L19/09, G10L25/27, G10L25/30, G10L25/78, G10L25/90, G10L2019/0012, G10L2025/783

Abstract: Encoding of prototype waveform components applicable to telecommunication systems provides improved voice quality, enabling a dual-channel mode of operation which permits more users to communicate over the same physical channel. A prototype waveform (PW) gain is vector quantized using a vector quantizer (VQ) that explicitly populates a codebook by representative steady state and transient vectors of PW gain for tracking the abrupt variations in speech levels during onsets and other non-stationary events, while maintaining the accuracy of the speech level during stationary conditions.

19.

Granted invention patent
Parallel/pipeline VLSI architecture for a low-delay CELP coder/decoder
Status: In force

Publication number: US06314393B1

Publication date: 2001-11-06

Application number: US09270918

Application date: 1999-03-16

Applicant: Yue-Peng Zheng, Shvetal K. Patel, Kumar Swaminathan

Inventor: Yue-Peng Zheng, Shvetal K. Patel, Kumar Swaminathan

IPC: G10L19/12

CPC classification number: G10L19/16, G10L19/12

Abstract: An integrated circuit for processing a speech signal in accordance with a CELP standard includes a plurality of processing elements coupled to a data bus in parallel. Each processing element includes a multiplier and an accumulator. The integrated circuit further includes an auxiliary processing element, which is also coupled to the data bus and has a division unit and a comparator. The plurality of processing elements and the auxiliary processing element are also coupled in a pipeline formation.

20.

Granted invention patent
Speech mode based multi-stage vector quantizer
Status: Lapsed

Publication number: US5966688A

Publication date: 1999-10-12

Application number: US958143

Application date: 1997-10-28

Applicant: Srinivas Nandkumar, Kumar Swaminathan

Inventor: Srinivas Nandkumar, Kumar Swaminathan

IPC: G10L11/06, G10L19/00, G10L19/06, G10L3/02

CPC classification number: G10L19/07, G10L25/93

Abstract: A speech mode based multi-stage vector quantizer is disclosed which quantizes and encodes line spectral frequency (LSF) vectors that were obtained by transforming the short-term predictor filter coefficients in a speech codec that utilizes linear predictive techniques. The quantizer includes a mode classifier that classifies each speech frame of a speech signal as being associated with one of a voiced, spectrally stationary (Mode A) speech frame, a voiced, spectrally non-stationary (Mode B) speech frame and an unvoiced (Mode C) speech frame. A converter converts each speech frame of the speech signal into an LSF vector and an LSF vector quantizer includes a 12-bit, two-stage, backward predictive vector encoder that encodes the Mode A speech frames and a 22-bit, four-stage backward predictive vector encoder that encodes the Mode B and the Mode C speech frames.
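
The multi-stage structure named in this abstract can be illustrated with the generic sketch below, in which each stage quantizes the residual left by the previous one. It makes several assumptions that are not taken from the patent: a greedy stage-by-stage search, plain squared error, no backward prediction, and toy codebooks supplied by the caller. The function names are hypothetical.

```python
def nearest(vector, codebook):
    """Index of the codeword with the smallest squared error to `vector`."""
    def sq_err(codeword):
        return sum((v - c) ** 2 for v, c in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def msvq_encode(vector, stages):
    """Quantize stage by stage; each stage codes the previous residual."""
    indices, residual = [], list(vector)
    for codebook in stages:
        index = nearest(residual, codebook)
        indices.append(index)
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return indices


def msvq_decode(indices, stages):
    """Rebuild the vector by summing the selected codeword of every stage."""
    out = [0.0] * len(stages[0][0])
    for index, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[index])]
    return out
```

In a mode-based design such as the one described, the encoder would pass a different `stages` list depending on the frame's mode, for example two codebooks for Mode A frames and four for the other modes, so that the bit budget follows how hard the frame is to represent.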
