Using signal to noise ratio of a speech signal to adjust thresholds for extracting speech parameters for coding the speech signal
    1.
    Granted invention patent
    Using signal to noise ratio of a speech signal to adjust thresholds for extracting speech parameters for coding the speech signal (in force)

    Publication No.: US06898566B1

    Publication date: 2005-05-24

    Application No.: US09640841

    Filing date: 2000-08-16

    Abstract: There are provided speech coding methods and systems for estimating a plurality of speech parameters of a speech signal for coding the speech signal using one of a plurality of speech coding algorithms, wherein the plurality of speech parameters includes pitch information and is calculated using a plurality of thresholds. An example method includes estimating a background noise level in the speech signal to determine a signal to noise ratio (SNR) for the speech signal, adjusting one or more of the plurality of thresholds based on the SNR to generate one or more SNR adjusted thresholds, analyzing the speech signal to extract the pitch information using the one or more SNR adjusted thresholds, and repeating the estimating, the adjusting and the analyzing to code the speech signal using one of the plurality of speech coding algorithms.

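The estimate/adjust loop described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the patented method: the function names, the use of the quietest frames as the noise estimate, the 30 dB "clean" reference, and the choice to relax (rather than tighten) the threshold as the SNR falls.

```python
import math

def estimate_snr_db(frame_energies, noise_fraction=0.1):
    """Estimate SNR in dB, treating the quietest frames as background noise (assumption)."""
    ordered = sorted(frame_energies)
    n_noise = max(1, int(len(ordered) * noise_fraction))
    noise = sum(ordered[:n_noise]) / n_noise
    signal = sum(ordered) / len(ordered)
    return 10.0 * math.log10(max(signal, 1e-12) / max(noise, 1e-12))

def adjust_threshold(base_threshold, snr_db, clean_snr_db=30.0, max_relax=0.3):
    """Scale a threshold down linearly as SNR drops below clean_snr_db (hypothetical rule)."""
    deficit = min(max(clean_snr_db - snr_db, 0.0), clean_snr_db)
    return base_threshold * (1.0 - max_relax * deficit / clean_snr_db)

# Per-frame loop: re-estimate the SNR, then re-derive the threshold used by pitch analysis.
energies = [1.0] * 5 + [100.0] * 5
threshold = adjust_threshold(0.8, estimate_snr_db(energies))
```

In a coder the adjusted threshold would feed whatever pitch or voicing decision the base threshold was designed for; the sketch stops at producing the adjusted value.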

    Deriving seed values to generate excitation values in a speech coder
    2.
    Granted invention patent
    Deriving seed values to generate excitation values in a speech coder (in force)

    Publication No.: US07146309B1

    Publication date: 2006-12-05

    Application No.: US10653874

    Filing date: 2003-09-02

    IPC classification: G10L19/00

    CPC classification: G10L19/08

    Abstract: There are provided methods and devices for generating excitation values for a speech signal. In one aspect, an example method comprises obtaining one or more characteristics of a first speech frame of the speech signal, deriving a first seed value based on the one or more characteristics of the first speech frame, providing the first seed value to a Gaussian time series generator, and using the Gaussian time series generator to generate excitation values for the first frame. The one or more characteristics may include spectrum information of the first frame, energy information of the first frame, or gain information of the first frame.

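A rough sketch of seeding a Gaussian generator from frame characteristics follows. The mixing formula, the integer "index" inputs, and both function names are hypothetical; the abstract does not say how the seed is computed from the spectrum, energy, or gain information.

```python
import random

def derive_seed(spectrum_index, energy_index, gain_index):
    """Mix quantized frame characteristics into a 32-bit seed (hypothetical mixing rule)."""
    seed = (spectrum_index * 31 + energy_index) * 31 + gain_index
    return seed & 0xFFFFFFFF

def gaussian_excitation(seed, length):
    """Generate `length` zero-mean, unit-variance samples from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]

# Two parties that hold the same frame parameters regenerate the same sequence.
seed = derive_seed(1, 2, 3)
excitation = gaussian_excitation(seed, 40)
```

Because the seed in this sketch depends only on the frame parameters, the sequence is reproducible from those parameters alone, without sending the samples themselves.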

    Conference bridge processing of speech in a packet network environment
    3.
    Granted invention patent
    Conference bridge processing of speech in a packet network environment (in force)

    Publication No.: US06463414B1

    Publication date: 2002-10-08

    Application No.: US09547832

    Filing date: 2000-04-12

    IPC classification: G10L1102

    CPC classification: G10L19/173

    Abstract: There is provided a conference bridge or transcoder configured to intelligently handle multiple speech channels in the context of a packet network, wherein various speech channels may adhere to a variety of speech encoding standards. For example, the conference bridge establishes framing and alignment of multiple incoming speech channels associated with multiple participants, extracts parameters from the speech samples, mixes the parameters, and re-encodes the resulting speech samples for transmission to the participants. In one aspect, a speech processing method comprises decoding a first bitstream according to a first coding scheme to generate first speech samples and a first side information; generating second speech samples and a second side information using the first speech samples and the first side information, for use according to a second coding scheme; and creating a second bitstream, encoded based on the second coding scheme, using the second speech samples and the second side information.


    Fast echo canceller reconvergence after TDM slips and echo level changes
    4.
    Invention patent application
    Fast echo canceller reconvergence after TDM slips and echo level changes (in force)

    Publication No.: US20060198511A1

    Publication date: 2006-09-07

    Application No.: US11072476

    Filing date: 2005-03-03

    IPC classification: H04M9/08

    CPC classification: H04B3/234

    Abstract: A method of adjusting an echo canceller comprises obtaining a first cross-correlation between a far-end signal and an error signal, wherein the error signal is generated by subtracting an output signal of an adaptive filter from a local-end signal; determining whether the first cross-correlation is above a pre-determined threshold; relocating the adaptive filter by a few samples if the first cross-correlation is determined to be above the pre-determined threshold; calculating a first improvement indicator parameter after relocating the adaptive filter by the few samples; determining whether the first improvement indicator parameter indicates a performance improvement by the adaptive filter after the relocating; calculating a gain based on the local-end signal and the error signal if no performance improvement is determined; and multiplying the adaptive filter by the gain.


    Speech encoder using voice activity detection in coding noise
    5.
    Granted invention patent
    Speech encoder using voice activity detection in coding noise (in force)

    Publication No.: US06823303B1

    Publication date: 2004-11-23

    Application No.: US09156832

    Filing date: 1998-09-18

    IPC classification: G10L1904

    Abstract: A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters is generated for higher quality decoding and reproduction. For each bit rate mode selected, pluralities of fixed or innovation subcodebooks are selected for use in generating innovation vectors. The speech coder distinguishes various voice signals as a function of their voice content. For example, a Voice Activity Detection (VAD) algorithm selects an appropriate coding scheme depending on whether the speech signal comprises active or inactive speech. The encoder may consider varying characteristics of the speech signal including sharpness, a delay correlation, a zero-crossing rate, and a residual energy. In another embodiment of the present invention, code excited linear prediction is used for voice active signals whereas random excitation is used for voice inactive signals; the energy level and spectral content of the voice inactive signal may also be used for noise coding.


    Double talk detector for echo cancellation in a speech communication system
    6.
    Granted invention patent
    Double talk detector for echo cancellation in a speech communication system (in force)

    Publication No.: US06804203B1

    Publication date: 2004-10-12

    Application No.: US09663246

    Filing date: 2000-09-15

    IPC classification: G01R3108

    CPC classification: H04M9/082

    Abstract: A speech communication system is provided that uses pitch information, pitch lags, pitch gains, energy and/or other speech characteristics of the outgoing speech and the unknown signal on a frame basis to determine whether the unknown signal is an echo signal of the outgoing speech or whether the unknown signal also contains speech from a second talker (double talk). Additionally, a plurality of frames of these characteristics of the outgoing speech signal and the unknown incoming signal may be buffered so that the analysis and comparison can be made more efficiently and quickly in the frame domain as opposed to the time domain. Multiple characteristics may optionally be weighted and then analyzed. The system and method may further determine a level of confidence in the determination, based on any criterion, which may then be used to adjust the gain of a filter.


    Speech communication system and method for handling lost frames
    7.
    Granted invention patent
    Speech communication system and method for handling lost frames (in force)

    Publication No.: US06636829B1

    Publication date: 2003-10-21

    Application No.: US09617191

    Filing date: 2000-07-14

    IPC classification: G10L1900

    Abstract: An exemplary decoder comprises a receiver that receives parameters of a speech signal on a frame-by-frame basis, a control logic for decoding parameters and for resynthesizing the speech signal, the control logic including a minimum spacing indicative of a minimum difference required between LSFs of consecutive frames, and a frame recovery logic that, when a lost frame detector detects a lost frame, sets the minimum spacing for the lost frame to a first value which is greater than the minimum spacing for the previously received frame, and/or uses pitch lag parameters of a plurality of previously received frames to extrapolate a pitch lag parameter for the lost frame, and/or sets a gain parameter of a subframe of the lost frame in a first manner if the lost gain parameter is an adaptive codebook gain parameter and in a second manner if the lost gain parameter is a fixed codebook gain parameter.

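The pitch lag extrapolation mentioned in this abstract could look roughly like the sketch below. The straight-line rule, the function name, and the lag limits of 20 and 147 samples are assumptions chosen for illustration; the abstract only states that lags of several previously received frames are used.

```python
def extrapolate_pitch_lag(previous_lags, min_lag=20, max_lag=147):
    """Predict the lag of a lost frame from earlier lags by linear extrapolation (assumption)."""
    if not previous_lags:
        raise ValueError("need at least one previously received lag")
    if len(previous_lags) == 1:
        estimate = previous_lags[-1]  # nothing to extrapolate from, so repeat the last lag
    else:
        # Average per-frame change across the history, applied one frame forward.
        slope = (previous_lags[-1] - previous_lags[0]) / (len(previous_lags) - 1)
        estimate = previous_lags[-1] + slope
    # Clamp to the range the coder can represent (hypothetical limits).
    return int(min(max(round(estimate), min_lag), max_lag))

lost_frame_lag = extrapolate_pitch_lag([50, 52, 54])
```

Simply repeating the last received lag is the single-frame special case of the same function.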

    Method for coding speech containing noise-like speech periods and/or having background noise
    8.
    Granted invention patent
    Method for coding speech containing noise-like speech periods and/or having background noise (in force)

    Publication No.: US06205423B1

    Publication date: 2001-03-20

    Application No.: US09420876

    Filing date: 1999-10-19

    IPC classification: G10L1904

    Abstract: A method is disclosed for coding speech under background noise conditions or during noise-like speech periods, wherein an analysis-by-synthesis method is used during active voice speech segments. However, when a background noise segment or noise-like speech segment is detected, an adaptive code book (pitch prediction) contribution is used as a source of a pseudo-random sequence in order to provide a better representation of the background noise or the noise-like speech. An improved gain quantization scheme is also employed when a background noise segment is detected, wherein the energy of the total excitation with quantized gains is matched to the energy of the total excitation with unquantized gains.

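The energy-matching idea in this abstract can be shown with a simplified sketch. It assumes the total excitation is a gain-weighted sum of an adaptive and a fixed codebook vector, and it matches energies by rescaling after quantization; the patent may instead choose the quantized gains so that the energies match, and all names here are hypothetical.

```python
import math

def energy(samples):
    """Sum of squared samples."""
    return sum(x * x for x in samples)

def energy_matched_excitation(adaptive, fixed, g_a, g_f, g_a_q, g_f_q):
    """Build the excitation from quantized gains, then rescale it to the unquantized energy."""
    target = [g_a * a + g_f * f for a, f in zip(adaptive, fixed)]
    quantized = [g_a_q * a + g_f_q * f for a, f in zip(adaptive, fixed)]
    e_q = energy(quantized)
    scale = math.sqrt(energy(target) / e_q) if e_q > 0.0 else 1.0
    return [scale * x for x in quantized]

adaptive = [1.0, 0.0, -1.0, 0.0]
fixed = [0.0, 1.0, 0.0, -1.0]
out = energy_matched_excitation(adaptive, fixed, 0.8, 0.5, 0.75, 0.5)
```

Matching energy rather than waveform fits noise-like segments, where perceived loudness matters more than sample-by-sample accuracy.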

    Signal compression using index mapping technique for the sharing of quantization tables
    9.
    Granted invention patent
    Signal compression using index mapping technique for the sharing of quantization tables (expired)

    Publication No.: US5920853A

    Publication date: 1999-07-06

    Application No.: US702780

    Filing date: 1996-08-23

    Abstract: A signal compression system includes a coder and a decoder. The coder includes an extract unit for extracting an input feature vector from an input signal, a coder memory unit for storing a pre-designed vector quantization (VQ) table for the coder such that the coder memory unit uses a set of primary indices to address entries within the pre-designed VQ table, a coder mapping unit for mapping indices from a set of secondary indices to the set of primary indices, and a search unit for searching for one index out of the set of secondary indices, wherein the index from the set of secondary indices corresponds to an entry in the coder memory unit, and the entry best represents the input feature vector according to some predetermined criteria. On the decoder side, the decoder includes a decoder memory unit for storing the same pre-designed VQ table and set of primary indices as the coder memory unit, a decoder mapping unit, and a retrieval unit, wherein the entry indicated by the index best represents the input feature vector.

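A small sketch of the secondary-to-primary index mapping follows. The table contents, the squared-error criterion, and the function names are illustrative assumptions; the abstract only requires that a secondary index be mapped to a primary index addressing a shared, pre-designed VQ table.

```python
def search_secondary_index(table, mapping, vector):
    """Return the secondary index whose mapped table entry is closest to `vector`."""
    def squared_error(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vector))
    # Only entries reachable through the mapping are searched.
    return min(range(len(mapping)), key=lambda s: squared_error(table[mapping[s]]))

def retrieve_entry(table, mapping, secondary_index):
    """Decoder side: map the received secondary index back to the shared table entry."""
    return table[mapping[secondary_index]]

shared_table = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # addressed by primary index
mapping = [0, 2]  # secondary index -> primary index (a 1-bit view of the 2-bit table)
index = search_secondary_index(shared_table, mapping, [1.9, 2.1])
decoded = retrieve_entry(shared_table, mapping, index)
```

Here a one-bit quantizer reuses two entries of a four-entry table, so only the short mapping list is stored in addition to the shared table.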

    Usage of voice activity detection for efficient coding of speech
    10.
    Granted invention patent
    Usage of voice activity detection for efficient coding of speech (expired)

    Publication No.: US5689615A

    Publication date: 1997-11-18

    Application No.: US589132

    Filing date: 1996-01-22

    CPC classification: G10L19/18

    Abstract: A method for efficient coding of non-active voice periods is disclosed for a speech communication system with (a) a speech encoder, (b) a communication channel and (c) a speech decoder. The method intermittently sends some information about the background noise when necessary in order to give a better quality of overall speech when non-active voice frames are detected. The coding efficiency of the non-active voice frames can be achieved by coding the energy of the frame and its spectrum with as few as 15 bits. These bits are not automatically transmitted whenever there is a non-active voice detection. Rather, the bits are transmitted only when an appreciable change has been detected with respect to the last time a non-active voice frame was sent. As an illustration of the benefits of the present invention, good overall quality can be achieved at an average rate as low as 4 kb/s during normal speech conversation.

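The "transmit only on appreciable change" rule from this abstract can be sketched as below. The 2 dB energy tolerance, the squared-distance spectrum test, and the function names are assumptions for illustration; the abstract does not define what counts as an appreciable change.

```python
def should_send_update(energy_db, spectrum, last_sent, energy_tol_db=2.0, spectrum_tol=0.1):
    """Decide whether a non-active frame differs enough from the last one sent (assumed test)."""
    if last_sent is None:
        return True  # nothing has been sent yet
    last_energy_db, last_spectrum = last_sent
    if abs(energy_db - last_energy_db) > energy_tol_db:
        return True
    distance = sum((a - b) ** 2 for a, b in zip(spectrum, last_spectrum))
    return distance > spectrum_tol

def frames_to_transmit(frames):
    """Return the indices of non-active frames for which noise parameters would be sent."""
    sent, last = [], None
    for i, (energy_db, spectrum) in enumerate(frames):
        if should_send_update(energy_db, spectrum, last):
            sent.append(i)
            last = (energy_db, spectrum)  # compare later frames against the last sent frame
    return sent

frames = [(-50.0, [0.1, 0.2]), (-50.5, [0.1, 0.2]), (-45.0, [0.1, 0.2]), (-45.0, [0.1, 0.2])]
sent = frames_to_transmit(frames)
```

Comparing against the last transmitted frame, rather than the previous frame, keeps slow drift from going unreported.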