Encoding and decoding speech signals variably based on signal classification
    1.
    Granted patent (in force)

    Publication number: US06735567B2

    Publication date: 2004-05-11

    Application number: US10409430

    Filing date: 2003-04-08

    IPC classification: G10L13/04

    Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
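
The two-level selection described in this abstract (a rate selection across four codecs, plus a type classification that applies only to the full-rate and half-rate codecs) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the names `Rate`, `FrameType`, `TYPE_0`, `TYPE_1` and `select_codec` are hypothetical, and the abstract does not say how many types exist or how they are decided.

```python
from enum import Enum
from typing import Optional


class Rate(Enum):
    FULL = "full"
    HALF = "half"
    QUARTER = "quarter"
    EIGHTH = "eighth"


class FrameType(Enum):
    # Hypothetical labels: the abstract does not name the types.
    TYPE_0 = 0
    TYPE_1 = 1


def select_codec(rate: Rate, frame_type: Optional[FrameType] = None) -> str:
    """Return a label for the codec path chosen for one frame."""
    if rate in (Rate.FULL, Rate.HALF):
        # Only the full- and half-rate codecs are further split by type.
        if frame_type is None:
            raise ValueError("full- and half-rate codecs need a type classification")
        return f"{rate.value}-rate codec, type {frame_type.value}"
    # Quarter- and eighth-rate codecs ignore the type classification.
    return f"{rate.value}-rate codec"
```

Under these assumptions, `select_codec(Rate.HALF, FrameType.TYPE_0)` picks the half-rate, type 0 path, while `select_codec(Rate.EIGHTH)` needs no type at all.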

    Bitstream protocol for transmission of encoded voice signals
    2.
    Granted patent (in force)

    Publication number: US06581032B1

    Publication date: 2003-06-17

    Application number: US09662828

    Filing date: 2000-09-15

    IPC classification: G10L19/12

    Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

    Speech coding system and method using bi-directional mirror-image predicted pulses
    3.
    Patent application (in force)

    Publication number: US20090043574A1

    Publication date: 2009-02-12

    Application number: US12284623

    Filing date: 2008-09-23

    IPC classification: G10L19/12 G10L19/00

    Abstract: There is provided a method of decoding speech data generated from a speech signal. The method comprises receiving the speech data having at least one main pulse in a subframe of the speech data; generating a first predicted pulse, based on the at least one main pulse, on one side of the main pulse in the subframe of the speech data, wherein the first predicted pulse has a lower gain than the main pulse; generating a second predicted pulse, as a mirror image of the first predicted pulse on a reverse time scale, on the other side of the main pulse in the subframe of the speech data; and reconstructing the speech signal using the at least one main pulse, the first predicted pulse and the second predicted pulse.
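
As a rough illustration of the pulse generation in this abstract, the sketch below places one main pulse in a subframe and adds two lower-gain predicted pulses at equal distances on either side of it. The function name, the `offset` parameter (for example, a pitch lag) and the `gain_ratio` value are assumptions made for the example; the abstract does not specify how the position or gain of the predicted pulses is derived.

```python
from typing import List


def add_predicted_pulses(subframe_len: int, main_pos: int, main_gain: float,
                         offset: int, gain_ratio: float = 0.5) -> List[float]:
    """Build an excitation subframe from a main pulse and its mirrored predictions."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    if not 0.0 < gain_ratio < 1.0:
        raise ValueError("predicted pulses must have a lower gain than the main pulse")

    excitation = [0.0] * subframe_len
    excitation[main_pos] = main_gain

    predicted_gain = main_gain * gain_ratio
    # First predicted pulse on one side, second as its mirror image on the other.
    for pos in (main_pos + offset, main_pos - offset):
        if 0 <= pos < subframe_len:  # drop pulses that fall outside the subframe
            excitation[pos] += predicted_gain
    return excitation
```

For example, `add_predicted_pulses(8, 4, 1.0, 2)` yields a main pulse at index 4 and half-gain pulses at indices 2 and 6.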

    Adaptive noise state update for a voice activity detector
    4.
    Granted patent (in force)

    Publication number: US07346502B2

    Publication date: 2008-03-18

    Application number: US11342130

    Filing date: 2006-01-26

    IPC classification: G10L11/06

    CPC classification: G10L25/78 G10L2025/786

    Abstract: There is provided a method of updating a noise state of a voice activity detector (VAD) for indicating an active voice mode and an inactive voice mode. The method comprises receiving an input signal having a plurality of frames, determining an elapsed time since the last update of the noise state, updating the noise state of the VAD if the elapsed time exceeds a predetermined time, determining an average minimum energy based on two or more of the plurality of frames, determining a current minimum energy based on a current frame of the plurality of frames, updating the noise state of the VAD if the average minimum energy is less than the current minimum energy, and updating the noise state of the VAD if the average minimum energy is greater than the current minimum energy plus a first predetermined value.
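
The three update conditions listed in this abstract can be collected into a single decision, sketched below. The function and parameter names are hypothetical, and the energies are taken as given inputs because the abstract does not say how the minimum energies are measured.

```python
def should_update_noise_state(elapsed: float, max_elapsed: float,
                              avg_min_energy: float, cur_min_energy: float,
                              margin: float) -> bool:
    """Return True if any of the abstract's three update conditions holds."""
    # 1. Too long since the last update of the noise state.
    if elapsed > max_elapsed:
        return True
    # 2. The average minimum energy is below the current minimum energy.
    if avg_min_energy < cur_min_energy:
        return True
    # 3. The average minimum energy exceeds the current minimum energy
    #    by more than the first predetermined value (here: margin).
    if avg_min_energy > cur_min_energy + margin:
        return True
    return False
```

Read this way, the noise state is left alone only while the update timer has not expired and the average minimum energy lies between the current minimum energy and the current minimum energy plus the margin.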

    Conference bridge processing of speech in a packet network environment
    6.
    Granted patent (in force)

    Publication number: US06463414B1

    Publication date: 2002-10-08

    Application number: US09547832

    Filing date: 2000-04-12

    IPC classification: G10L11/02

    CPC classification: G10L19/173

    Abstract: There is provided a conference bridge or transcoder configured to intelligently handle multiple speech channels in the context of a packet network, wherein various speech channels may adhere to a variety of speech encoding standards. For example, the conference bridge establishes framing and alignment of multiple incoming speech channels associated with multiple participants, extracts parameters from the speech samples, mixes the parameters, and re-encodes the resulting speech samples for transmission to the participants. In one aspect, a speech processing method comprises decoding a first bitstream according to a first coding scheme to generate first speech samples and a first side information; generating second speech samples and a second side information using the first speech samples and the first side information, for use according to a second coding scheme; and creating a second bitstream, encoded based on the second coding scheme, using the second speech samples and the second side information.

    Speech codec employing noise classification for noise compensation
    7.
    Granted patent (in force)

    Publication number: US06240386B1

    Publication date: 2001-05-29

    Application number: US09198414

    Filing date: 1998-11-24

    IPC classification: G10L21/00

    Abstract: A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters are generated for higher quality decoding and reproduction. For each bit rate mode selected, pluralities of fixed or innovation subcodebooks are selected for use in generating innovation vectors. The speech coder distinguishes various voice signals as a function of their voice content. For example, a Voice Activity Detection (VAD) algorithm selects an appropriate coding scheme depending on whether the speech signal comprises active or inactive speech. The encoder may consider varying characteristics of the speech signal including sharpness, a delay correlation, a zero-crossing rate, and a residual energy. In another embodiment of the present invention, code excited linear prediction is used for voice active signals whereas random excitation is used for voice inactive signals; the energy level and spectral content of the voice inactive signal may also be used for noise coding. The multi-rate speech codec may employ distributed detection and compensation in processing the speech signal. For high quality perceptual speech reproduction, the speech codec may perform noise detection in both an encoder and a decoder. The noise detection may be coordinated between the encoder and decoder. Similarly, noise compensation may be performed in a distributed manner among both the decoder and the encoder.
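
To make the VAD-driven choice between CELP and random excitation more concrete, here is a toy sketch that computes a zero-crossing rate and a frame energy and uses them to pick an excitation model. It is only an illustration under stated assumptions: the thresholds are invented, the decision rule is not taken from the patent, and plain frame energy stands in for the residual energy mentioned in the abstract.

```python
from typing import Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def choose_excitation(frame: Sequence[float],
                      energy_threshold: float = 0.01,  # hypothetical value
                      zcr_threshold: float = 0.5) -> str:  # hypothetical value
    """Pick 'celp' for frames judged active, 'random' for inactive ones."""
    active = (frame_energy(frame) > energy_threshold
              and zero_crossing_rate(frame) < zcr_threshold)
    return "celp" if active else "random"
```

A loud frame with few sign changes is routed to the CELP path, while a quiet or noise-like frame falls back to random excitation.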

    Selection of scalar quantization (SQ) and vector quantization (VQ) for speech coding

    Publication number: US08620647B2

    Publication date: 2013-12-31

    Application number: US12321935

    Filing date: 2009-01-26

    IPC classification: G10L11/06

    Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to where the generally periodic component of the speech is generally not stationary or less than completely periodic and requires greater frequency of updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.

    Embedded silence and background noise compression
    9.
    Granted patent (in force)

    Publication number: US08032359B2

    Publication date: 2011-10-04

    Application number: US12002131

    Filing date: 2007-12-14

    IPC classification: G10L21/00 G10L11/06

    Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.
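
The data flow in this abstract (band split, narrowband encoding, an auxiliary signal passed from the narrowband encoder to the high-band encoder) might be sketched as below. Everything concrete here is an assumption made for illustration: the two-tap filters, the use of mean energy as the only coded parameter, and the choice of narrowband energy as the low-to-high auxiliary signal are not taken from the patent.

```python
from typing import Dict, List, Tuple


def low_pass(signal: List[float]) -> List[float]:
    """Toy two-tap averaging filter (previous sample treated as 0 at the start)."""
    return [(x + (signal[i - 1] if i else 0.0)) / 2 for i, x in enumerate(signal)]


def high_pass(signal: List[float]) -> List[float]:
    """Toy two-tap differencing filter, complementary to low_pass."""
    return [(x - (signal[i - 1] if i else 0.0)) / 2 for i, x in enumerate(signal)]


def mean_energy(signal: List[float]) -> float:
    return sum(x * x for x in signal) / len(signal)


def encode_narrowband_inactive(nb: List[float]) -> Tuple[Dict[str, float], float]:
    """Return the encoded narrowband parameters and an auxiliary signal."""
    energy = mean_energy(nb)
    return {"nb_energy": energy}, energy  # auxiliary = narrowband energy (assumed)


def encode_highband_inactive(hb: List[float], aux: float) -> Dict[str, float]:
    """Encode the high band relative to the auxiliary signal from the narrowband."""
    ratio = mean_energy(hb) / aux if aux > 0 else 0.0
    return {"hb_to_nb_ratio": ratio}


def encode_inactive_frame(frame: List[float]) -> Tuple[Dict[str, float], Dict[str, float]]:
    nb, hb = low_pass(frame), high_pass(frame)
    nb_encoded, aux = encode_narrowband_inactive(nb)
    hb_encoded = encode_highband_inactive(hb, aux)
    return nb_encoded, hb_encoded  # both parts would then be transmitted
```

The point of the sketch is the dependency order: the high-band encoder cannot run until the narrowband encoder has produced its auxiliary signal.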
