Timing recovery scheme for packet speech in multiplexing environment of
voice with data applications
    1.
    Granted invention patent
    Timing recovery scheme for packet speech in multiplexing environment of voice with data applications (status: expired)

    Publication No.: US5699481A

    Publication date: 1997-12-16

    Application No.: US443651

    Filing date: 1995-05-18

    Abstract: Multiple speech bit-stream frame buffers are used between the controller and the speech decoder. Whenever excessive or missing speech packets are detected, the speech decoder switches to a special corrective mode. If there are too many packets, the buffered frames are played out fast; if there are too few, the buffered frames are played out slowly. For fast play, some speech information has to be discarded, while for slow play some speech-like information has to be synthesized. The speech may be handled in sub-frame units, which may be 52 samples at a time. Low-energy, silent or unvoiced sub-frames, which also indicate non-periodicity, are detected and manipulated. Moreover, the decoded signal is manipulated at the excitation phase, before the final LPC synthesis filter, resulting in a transparent perceptual effect on the manipulated speech quality. Additionally, the buffers are enlarged such that the problem caused by controller asynchronicity is eliminated. Further, for bulk delay caused by multiplexing data and speech transmissions, the buffers maintain the smallest number of speech packets necessary to prevent buffer underflow during a data packet transmission while minimizing speech delay and preserving data transmission efficiency.
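
A minimal sketch of the corrective playout idea in this abstract, under stated assumptions: the buffer thresholds (`low_mark`, `high_mark`), the function names, and the rule of dropping or repeating the single lowest-energy sub-frame are all hypothetical illustrations, not the patented procedure. The sub-frames stand in for excitation-domain data before LPC synthesis.

```python
def playout_mode(buffered_frames, low_mark=1, high_mark=3):
    """Pick a playout mode from buffer occupancy (thresholds are assumed)."""
    if buffered_frames > high_mark:
        return "fast"    # too many frames queued: shorten playout
    if buffered_frames < low_mark:
        return "slow"    # too few frames queued: stretch playout
    return "normal"


def energy(subframe):
    """Sum of squared samples of one sub-frame."""
    return sum(s * s for s in subframe)


def adjust_subframes(subframes, mode):
    """Drop (fast) or repeat (slow) the lowest-energy sub-frame.

    Low-energy sub-frames are chosen because the abstract says silent or
    unvoiced sub-frames are the ones manipulated; picking exactly one per
    call is an assumption made for this sketch.
    """
    if mode == "normal" or not subframes:
        return list(subframes)
    k = min(range(len(subframes)), key=lambda i: energy(subframes[i]))
    if mode == "fast":
        return subframes[:k] + subframes[k + 1:]
    return subframes[:k + 1] + [subframes[k]] + subframes[k + 1:]


if __name__ == "__main__":
    subs = [[3, 3], [0, 1], [2, 2]]
    print(playout_mode(5), adjust_subframes(subs, "fast"))
    print(playout_mode(0), adjust_subframes(subs, "slow"))
```

A real decoder would synthesize speech-like excitation for slow play rather than repeat a sub-frame verbatim; repetition is used here only to keep the sketch short.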

    Codebook sharing for LSF quantization
    2.
    Granted invention patent
    Codebook sharing for LSF quantization (status: in force)

    Publication No.: US08635063B2

    Publication date: 2014-01-21

    Application No.: US12321950

    Filing date: 2009-01-26

    Applicants: Yang Gao, Eyal Shlomot

    Inventors: Yang Gao, Eyal Shlomot

    IPC classification: G10L11/06

    Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to where the generally periodic component of the speech is generally not stationary or less than completely periodic and requires greater frequency of updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.

    Signal classifying method and apparatus
    3.
    Granted invention patent
    Signal classifying method and apparatus (status: in force)

    Publication No.: US08438021B2

    Publication date: 2013-05-07

    Application No.: US12979994

    Filing date: 2010-12-28

    IPC classification: G10L15/20 G10L11/04 G10L15/04

    CPC classification: G10L25/81 G10L2025/786

    Abstract: A signal classifying method and apparatus are disclosed. The signal classifying method includes: obtaining a spectrum fluctuation parameter of a current signal frame determined as a foreground frame, and buffering the spectrum fluctuation parameter; obtaining a spectrum fluctuation variance of the current signal frame according to spectrum fluctuation parameters of all buffered signal frames, and buffering the spectrum fluctuation variance; and calculating a ratio of signal frames whose spectrum fluctuation variance is above or equal to a first threshold to all the buffered signal frames, and determining the current signal frame as a speech frame if the ratio is above or equal to a second threshold or determining the current signal frame as a music frame if the ratio is below the second threshold. In the embodiments of the present disclosure, the spectrum fluctuation variance of the signal is used as a parameter for classifying the signals, and a local statistical method is applied to decide the type of the signal. Therefore, the signals are classified with few parameters, simple logical relations and low complexity.
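
The decision rule in this abstract can be sketched as follows. This is a hedged illustration, not the patented implementation: the class name, buffer length, and both threshold values are assumptions, and the spectrum fluctuation parameter itself is taken as a precomputed input because the abstract does not define how it is derived.

```python
from collections import deque
from statistics import pvariance


class FluctuationClassifier:
    """Speech/music decision from buffered spectrum-fluctuation variances."""

    def __init__(self, buffer_len=60, var_threshold=1.0, ratio_threshold=0.5):
        # All three defaults are placeholders chosen for this sketch.
        self.flux = deque(maxlen=buffer_len)        # fluctuation parameters
        self.variances = deque(maxlen=buffer_len)   # per-frame variances
        self.var_threshold = var_threshold          # "first threshold"
        self.ratio_threshold = ratio_threshold      # "second threshold"

    def classify(self, flux_value):
        """Classify one foreground frame given its fluctuation parameter."""
        self.flux.append(flux_value)
        # Variance over all buffered fluctuation parameters.
        self.variances.append(pvariance(self.flux))
        # Share of buffered frames whose variance reaches the first threshold.
        high = sum(1 for v in self.variances if v >= self.var_threshold)
        ratio = high / len(self.variances)
        return "speech" if ratio >= self.ratio_threshold else "music"


if __name__ == "__main__":
    clf = FluctuationClassifier(buffer_len=10)
    for value in [0.0, 10.0] * 5:
        label = clf.classify(value)
    print(label)
```

The sketch uses a fixed-length `deque` so old frames fall out automatically, which is one simple way to realize the "local statistical method" the abstract mentions.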

    Method and apparatus for encoding and decoding
    4.
    Granted invention patent
    Method and apparatus for encoding and decoding (status: in force)

    Publication No.: US08370135B2

    Publication date: 2013-02-05

    Application No.: US12820805

    Filing date: 2010-06-22

    IPC classification: G10L21/02

    CPC classification: G10L19/012

    Abstract: An encoding method includes: extracting background noise characteristic parameters within a hangover period; for a first superframe after the hangover period, performing background noise encoding based on the extracted background noise characteristic parameters; for superframes after the first superframe, performing background noise characteristic parameter extraction and DTX decision for each frame in those superframes; and, for the superframes after the first superframe, performing background noise encoding based on extracted background noise characteristic parameters of the current superframe, background noise characteristic parameters of a plurality of superframes previous to the current superframe, and a final DTX decision. A decoding method and apparatus and an encoding apparatus are also disclosed. Bandwidth occupancy may be reduced substantially while the signal quality may be guaranteed.

    Decoder with embedded silence and background noise compression
    5.
    Granted invention patent
    Decoder with embedded silence and background noise compression (status: in force)

    Publication No.: US08195450B2

    Publication date: 2012-06-05

    Application No.: US13199794

    Filing date: 2011-09-08

    IPC classification: G10L21/00 G10L11/06

    Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.

    Multi-stage quantization method and device
    6.
    Invention patent application
    Multi-stage quantization method and device (status: in force)

    Publication No.: US20100217753A1

    Publication date: 2010-08-26

    Application No.: US12772190

    Filing date: 2010-05-01

    IPC classification: G06F17/30

    Abstract: The invention discloses a multi-stage quantization method, which includes the following steps: obtaining a reference codebook according to a previous stage codebook; obtaining a current stage codebook according to the reference codebook and a scaling factor; and quantizing an input vector by using the current stage codebook. The invention also discloses a multi-stage quantization device. With the invention, the current stage codebook may be obtained according to the previous stage codebook, by using the correlation between the current stage codebook and the previous stage codebook. As a result, it does not require an independent codebook space for the current stage codebook, which saves the storage space and improves the resource usage efficiency.
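
The three steps in this abstract can be illustrated with a two-stage vector quantizer. This is a sketch under assumptions, not the disclosed method: the reference codebook is taken to be the previous stage codebook unchanged, the second stage is assumed to quantize the first stage's residual, and the function names and the scaling factor value are hypothetical.

```python
def nearest(codebook, vec):
    """Return (index, codevector) with the smallest squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, vec))
    idx = min(range(len(codebook)), key=lambda i: dist(codebook[i]))
    return idx, codebook[idx]


def derive_codebook(prev_codebook, scale):
    """Current stage codebook = reference codebook times a scaling factor.

    Assumption: the reference codebook is the previous stage codebook itself,
    so no separate table needs to be stored for the current stage.
    """
    return [[scale * x for x in c] for c in prev_codebook]


def two_stage_quantize(vec, stage1, scale):
    """Quantize vec with stage1, then its residual with the derived codebook."""
    i1, c1 = nearest(stage1, vec)
    residual = [a - b for a, b in zip(vec, c1)]
    stage2 = derive_codebook(stage1, scale)   # computed on the fly, not stored
    i2, c2 = nearest(stage2, residual)
    recon = [a + b for a, b in zip(c1, c2)]
    return i1, i2, recon


if __name__ == "__main__":
    stage1 = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
    print(two_stage_quantize([1.25, 0.75], stage1, scale=0.25))
```

Only `stage1` and the scalar `scale` are kept in memory here, which mirrors the storage saving the abstract claims.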

    Perceptual masking of residual echo
    7.
    Granted invention patent
    Perceptual masking of residual echo (status: in force)

    Publication No.: US07711107B1

    Publication date: 2010-05-04

    Application No.: US11129450

    Filing date: 2005-05-12

    IPC classification: H04M9/08

    CPC classification: H04B3/234

    Abstract: A method of masking a residual echo signal by an echo canceller is provided. The method comprises receiving a far-end signal, adjusting filter coefficients of an adaptive filter in response to the far-end signal, generating an echo model signal based on the far-end signal using the adaptive filter, receiving a near-end signal, subtracting the echo model signal from the near-end signal to generate an output signal, defining a spectral mask based on the near-end signal, wherein the spectral mask is indicative of near-end spectral peaks and near-end spectral valleys, de-emphasizing the output signal in spectral regions of the near-end spectral peaks, and emphasizing the output signal in spectral regions of the near-end spectral valleys, wherein the de-emphasizing occurs during filter coefficients determination for the adaptive filter. A weighted filter may perform the de-emphasizing and the emphasizing operations, where the weighted filter uses medium term spectral characteristics of the near-end signal.

    Embedded silence and background noise compression
    8.
    Invention patent application
    Embedded silence and background noise compression (status: in force)

    Publication No.: US20080195383A1

    Publication date: 2008-08-14

    Application No.: US12002131

    Filing date: 2007-12-14

    IPC classification: G10L19/14

    Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.

    Adaptive voice mode extension for a voice activity detector
    9.
    Invention patent application
    Adaptive voice mode extension for a voice activity detector (status: in force)

    Publication No.: US20060217973A1

    Publication date: 2006-09-28

    Application No.: US11342104

    Filing date: 2006-01-26

    IPC classification: G10L19/12

    CPC classification: G10L25/78 G10L2025/786

    Abstract: There is provided a voice activity detection method for indicating an active voice mode and an inactive voice mode. The method comprises receiving a first portion of an input signal; determining that the first portion of the input signal includes an active voice signal; indicating the active voice mode in response to the determining that the first portion of the input signal includes the active voice signal; receiving a second portion of the input signal immediately following the first portion of the input signal; determining that the second portion of the input signal includes an inactive voice signal; extending the indication of the active voice mode for a period of time after determining that the second portion of the input signal includes the inactive voice signal, wherein the period of time varies based on one or more conditions; and indicating the inactive voice mode after expiration of the period of time.
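
The variable extension period described in this abstract can be sketched as a frame-based hangover counter. The condition used here (a longer extension after a longer active burst) is only one assumed example of "one or more conditions"; the function name and the parameters `short_hang`, `long_hang` and `long_burst` are hypothetical.

```python
def vad_with_hangover(raw_flags, short_hang=2, long_hang=5, long_burst=4):
    """Extend the active-voice indication after raw activity ends.

    raw_flags: per-frame raw decisions (truthy means active voice).
    Returns the per-frame indicated mode (True means active voice mode).
    """
    out = []
    burst = 0   # length of the current run of raw-active frames
    hang = 0    # remaining extension frames
    for active in raw_flags:
        if active:
            burst += 1
            # Assumed condition: long bursts earn a longer extension.
            hang = long_hang if burst >= long_burst else short_hang
            out.append(True)
        else:
            burst = 0
            if hang > 0:
                hang -= 1
                out.append(True)    # still inside the extension period
            else:
                out.append(False)   # extension expired: inactive voice mode
    return out


if __name__ == "__main__":
    print(vad_with_hangover([1, 0, 0, 0, 0]))
    print(vad_with_hangover([1, 1, 1, 1, 0, 0, 0, 0, 0, 0]))
```

A single active frame followed by silence stays indicated as active for two extra frames, while a four-frame burst is extended by five frames.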

    Codebook tables for multi-rate encoding and decoding with pre-gain and delayed-gain quantization tables
    10.
    Granted invention patent
    Codebook tables for multi-rate encoding and decoding with pre-gain and delayed-gain quantization tables (status: in force)

    Publication No.: US06757649B1

    Publication date: 2004-06-29

    Application No.: US10409404

    Filing date: 2003-04-08

    IPC classification: G10L19/12

    Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
