Low-complexity encoding/decoding of quantized MDCT spectrum in scalable speech and audio codecs
    21.
    Invention application
    Legal status: in force

    Publication number: US20090234644A1

    Publication date: 2009-09-17

    Application number: US12255604

    Filing date: 2008-10-21

    CPC classification number: G10L19/24 G10L19/038

    Abstract: A scalable speech and audio codec is provided that implements combinatorial spectrum encoding. A residual signal is obtained from a Code Excited Linear Prediction (CELP)-based encoding layer, where the residual signal is a difference between an original audio signal and a reconstructed version of the original audio signal. The residual signal is transformed at a Discrete Cosine Transform (DCT)-type transform layer to obtain a corresponding transform spectrum having a plurality of spectral lines. The transform spectrum spectral lines are transformed using a combinatorial position coding technique. The combinatorial position coding technique includes generating a lexicographical index for a selected subset of spectral lines, where each lexicographical index represents one of a plurality of possible binary strings representing the positions of the selected subset of spectral lines. The lexicographical index represents non-zero spectral lines in a binary string in fewer bits than the length of the binary string.
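
    The lexicographical indexing described in the abstract can be illustrated with a minimal sketch. It assumes the positions of the non-zero spectral lines are given as a binary string of length n with k ones, and ranks that string among all C(n, k) such strings in lexicographic order. The function names are hypothetical, and the sketch covers positions only (not magnitudes or signs); it is not taken from the patent text.

```python
from math import comb


def positions_to_index(bits):
    """Rank a binary string among all strings of the same length and
    the same number of ones, in lexicographic order (0-based)."""
    n = len(bits)
    remaining = sum(bits)  # ones not yet consumed
    index = 0
    for i, bit in enumerate(bits):
        if bit:
            # Every string with a 0 here and `remaining` ones in the
            # n - i - 1 later positions sorts before this one.
            index += comb(n - i - 1, remaining)
            remaining -= 1
    return index


def index_to_positions(index, n, k):
    """Inverse of positions_to_index for strings of length n with k ones."""
    bits = []
    remaining = k
    for i in range(n):
        smaller = comb(n - i - 1, remaining)
        if remaining > 0 and index >= smaller:
            bits.append(1)
            index -= smaller
            remaining -= 1
        else:
            bits.append(0)
    return bits


def index_bits(n, k):
    """Bits needed to store any index in the range 0 .. C(n, k) - 1."""
    return (comb(n, k) - 1).bit_length()
```

    For example, with 3 non-zero lines among 16 positions there are C(16, 3) = 560 possible strings, so index_bits(16, 3) gives 10 bits rather than the 16 bits of the raw binary string.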

    Method and apparatus for high performance low bit-rate coding of unvoiced speech

    Publication number: US20050143980A1

    Publication date: 2005-06-30

    Application number: US11066356

    Filing date: 2005-02-24

    Applicant: Pengjun Huang

    Inventor: Pengjun Huang

    CPC classification number: G10L19/12 G10L19/083 G10L19/18 G10L25/93

    Abstract: A low-bit-rate coding technique for unvoiced segments of speech, without loss of quality compared to the conventional Code Excited Linear Prediction (CELP) method operating at a much higher bit rate. A set of gains is derived from a residual signal after whitening the speech signal by a linear prediction filter. These gains are then quantized and applied to a randomly generated sparse excitation. The excitation is filtered, and its spectral characteristics are analyzed and compared to the spectral characteristics of the original residual signal. Based on this analysis, a filter is chosen to shape the spectral characteristics of the excitation to achieve optimal performance.
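
    The gain and excitation steps in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the gains are taken to be per-subframe RMS values, the sparse excitation is Gaussian noise with all but the largest samples zeroed, and gain quantization and the spectral shaping filter are omitted. All function names and the keep_ratio parameter are hypothetical.

```python
import math
import random


def subframe_gains(residual, n_sub):
    """RMS value of each of n_sub equal-length subframes (assumed gain definition)."""
    size = len(residual) // n_sub
    return [
        math.sqrt(sum(x * x for x in residual[i * size:(i + 1) * size]) / size)
        for i in range(n_sub)
    ]


def sparse_excitation(length, keep_ratio, rng):
    """Gaussian noise in which only the largest-magnitude samples are kept."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(length)]
    keep = max(1, int(length * keep_ratio))
    threshold = sorted((abs(x) for x in noise), reverse=True)[keep - 1]
    return [x if abs(x) >= threshold else 0.0 for x in noise]


def shaped_excitation(gains, size, keep_ratio=0.25, seed=0):
    """Scale a random sparse excitation so each subframe matches its gain."""
    rng = random.Random(seed)
    out = []
    for gain in gains:
        exc = sparse_excitation(size, keep_ratio, rng)
        rms = math.sqrt(sum(x * x for x in exc) / size)
        out.extend(gain * x / rms for x in exc)
    return out
```

    Because the decoder can regenerate the same random excitation from a shared seed, only the gains would need to be transmitted in a scheme like this one, which is the intuition behind the low bit rate.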

    Method and apparatus for improved detection of rate errors in variable rate receivers
    27.
    Granted invention patent
    Legal status: in force

    Publication number: US08243695B2

    Publication date: 2012-08-14

    Application number: US12537906

    Filing date: 2009-08-07

    CPC classification number: H04L1/08 H04L1/0046 H04L1/201

    Abstract: A system and method for detection of rate determination algorithm errors in variable rate communications system receivers. The disclosed embodiments prevent rate determination algorithm errors from causing audible artifacts such as screeches or beeps. The disclosed system and method detect frames with incorrectly determined data rates and perform frame erasure processing and/or memory state clean-up to prevent propagation of distortion across multiple frames. Frames with incorrectly determined data rates are detected by checking for illegal rate transitions and reserved bits, validating unused filter type bit combinations, and analyzing relationships between fixed codebook gains and linear prediction coefficient gains.
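
    Two of the checks named in the abstract (illegal rate transitions and reserved bits) can be sketched as a simple frame screen. The rate names, the set of illegal transitions, and the Frame fields below are hypothetical placeholders, since the abstract does not list them. The choice to update the rate history only on trusted frames is also an assumption of this sketch.

```python
from dataclasses import dataclass

# Hypothetical rule set: the abstract does not say which transitions are illegal.
ILLEGAL_TRANSITIONS = {("full", "quarter")}


@dataclass
class Frame:
    rate: str           # rate chosen by the rate determination algorithm
    reserved_bits: int  # decoded value of the reserved bit field


def is_suspect(prev_rate, frame):
    """Flag a frame whose determined rate is probably wrong."""
    if (prev_rate, frame.rate) in ILLEGAL_TRANSITIONS:
        return True
    if frame.reserved_bits != 0:  # reserved bits are assumed to be sent as zero
        return True
    return False


def screen(frames):
    """Return 'decode' or 'erase' for each frame, in order."""
    decisions = []
    prev_rate = None
    for frame in frames:
        if is_suspect(prev_rate, frame):
            decisions.append("erase")
        else:
            decisions.append("decode")
            prev_rate = frame.rate  # only trusted frames update the history
    return decisions
```

    A frame marked "erase" would be handed to frame erasure processing instead of the normal decoder, so that a wrongly decoded frame does not corrupt the decoder state used by later frames.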

    METHOD AND APPARATUS FOR VECTOR QUANTIZATION CODEBOOK SEARCH
    28.
    Invention application
    Legal status: under examination (published)

    Publication number: US20100174539A1

    Publication date: 2010-07-08

    Application number: US12349327

    Filing date: 2009-01-06

    CPC classification number: G10L19/038

    Abstract: A vector quantization codebook search method and apparatus use support vector machines (“SVMs”) to compute a hyperplane, where the hyperplane is used to separate codebook elements into a plurality of bins. During execution, a controller determines which of the plurality of bins contains a desired codebook element, and then searches the determined bin. Codebook search complexity is reduced and an exhaustive codebook search is selectively avoided.
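
    The bin-based search in the abstract can be sketched as follows. This is a minimal illustration that assumes a single hyperplane w·x + b = 0 is already available; here it is supplied by hand rather than computed by an SVM, and all function names are hypothetical.

```python
def side(w, b, x):
    """Return 1 if x lies on the non-negative side of the hyperplane, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0


def build_bins(codebook, w, b):
    """Partition codebook indices into two bins, one per side of the hyperplane."""
    bins = {0: [], 1: []}
    for idx, codeword in enumerate(codebook):
        bins[side(w, b, codeword)].append(idx)
    return bins


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


def search(codebook, bins, w, b, x):
    """Search only the bin on the same side as x.

    Falls back to an exhaustive search if that bin is empty.
    """
    candidates = bins[side(w, b, x)] or range(len(codebook))
    return min(candidates, key=lambda i: sq_dist(codebook[i], x))
```

    With two bins of roughly equal size, each search compares against about half of the codebook. The nearest codeword can occasionally lie on the other side of the hyperplane, so a sketch like this trades a small amount of accuracy for the reduced search.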

    Method and apparatus for robust speech classification
    30.
    Granted invention patent
    Legal status: in force

    Publication number: US07472059B2

    Publication date: 2008-12-30

    Application number: US09733740

    Filing date: 2000-12-08

    Applicant: Pengjun Huang

    Inventor: Pengjun Huang

    CPC classification number: G10L25/93 G10L19/025 G10L19/22 G10L25/78

    Abstract: A speech classification technique for robust classification of varying modes of speech to enable maximum performance of multi-mode variable bit rate encoding techniques. A speech classifier accurately classifies a high percentage of speech segments for encoding at minimal bit rates, meeting lower bit rate requirements. Highly accurate speech classification produces a lower average encoded bit rate, and higher quality decoded speech. The speech classifier considers a maximum number of parameters for each frame of speech, producing numerous and accurate speech mode classifications for each frame. The speech classifier correctly classifies numerous modes of speech under varying environmental conditions. The speech classifier inputs classification parameters from external components, generates internal classification parameters from the input parameters, sets a Normalized Auto-correlation Coefficient Function threshold and selects a parameter analyzer according to the signal environment, and then analyzes the parameters to produce a speech mode classification.
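
    The environment-dependent threshold on the Normalized Auto-correlation Coefficient Function (NACF) can be illustrated with a small sketch. It assumes the NACF is evaluated at a single candidate pitch lag and reduces the classification to two modes; the threshold values, environment labels, and function names are hypothetical placeholders, not values from the patent.

```python
import math

# Hypothetical thresholds per signal environment (placeholder values).
NACF_THRESHOLD = {"clean": 0.6, "noisy": 0.4}


def nacf(frame, lag):
    """Normalized autocorrelation of a frame at a lag >= 1, in [-1, 1]."""
    ahead = frame[lag:]
    behind = frame[:-lag]
    num = sum(x * y for x, y in zip(ahead, behind))
    den = math.sqrt(sum(x * x for x in ahead) * sum(y * y for y in behind))
    return num / den if den > 0 else 0.0


def classify(frame, lag, environment):
    """Label a frame 'voiced' or 'unvoiced' using the environment's threshold."""
    threshold = NACF_THRESHOLD[environment]
    return "voiced" if nacf(frame, lag) >= threshold else "unvoiced"
```

    A strongly periodic frame evaluated at its true pitch lag has an NACF close to 1 and is labelled voiced under either threshold. The classifier described in the abstract combines many more parameters and produces more than two modes.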
