Pitch determination for speech processing
    1.
    Patent application
    Pitch determination for speech processing (status: under examination, published)

    Publication No.: US20080147384A1

    Publication date: 2008-06-19

    Application No.: US12069973

    Filing date: 2008-02-14

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    IPC classification: G10L11/04

    Abstract: There is provided a method of selecting a pitch lag value for a portion of a speech signal, the method comprising: computing a weighted correlation function of the portion of the speech signal for a range of delay times, wherein the weighting of the correlation function depends on both the delay time and a characteristic of one or more previous portions of the speech signal; and selecting the pitch lag value based on a delay time from the range of delay times that maximizes the weighted correlation function.

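The weighted-correlation selection described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the normalization, the linear delay-dependent weight, the 1.2 boost near the previous lag, and all names (`select_pitch_lag`, `prev_lag`, `prev_voiced`) are assumptions chosen for illustration. The "characteristic of previous portions" is assumed here to be a voiced flag plus the previous pitch lag.

```python
import math

def select_pitch_lag(frame, history, min_lag, max_lag, prev_lag=None, prev_voiced=False):
    """Return the lag in [min_lag, max_lag] that maximizes a weighted correlation.

    `history` must hold at least `max_lag` samples that precede `frame`.
    """
    signal = list(history) + list(frame)
    start = len(history)
    span = max(1, max_lag - min_lag)
    best_lag, best_score = min_lag, -math.inf
    for lag in range(min_lag, max_lag + 1):
        # Segment of the signal delayed by `lag` samples relative to the frame.
        past = signal[start - lag:start - lag + len(frame)]
        num = sum(x * p for x, p in zip(frame, past))
        den = sum(p * p for p in past)
        corr = num / math.sqrt(den) if den > 0 else 0.0
        # Delay-dependent weight (assumed form): mildly favor short lags
        # so that multiples of the true pitch period are discouraged.
        weight = 1.0 - 0.1 * (lag - min_lag) / span
        # History-dependent weight (assumed form): if the previous portion
        # was voiced, boost lags close to the previous pitch lag.
        if prev_voiced and prev_lag is not None and abs(lag - prev_lag) <= 2:
            weight *= 1.2
        score = weight * corr
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

With a pure 40-sample-period tone, the unweighted correlation is equally high at lags 40 and 80; the delay-dependent weight resolves the tie toward 40, while a voiced history with a previous lag of 80 pulls the choice to 80.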

    Pitch determination based on weighting of pitch lag candidates
    2.
    Granted patent
    Pitch determination based on weighting of pitch lag candidates (status: in force)

    Publication No.: US07266493B2

    Publication date: 2007-09-04

    Application No.: US11251179

    Filing date: 2005-10-13

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    IPC classification: G10L11/04

    Abstract: There is provided a method of selecting a pitch lag value from a plurality of pitch lag candidates for coding a speech signal. The method comprises identifying the plurality of pitch lag candidates from a frame of the speech signal using correlation; classifying the speech signal to obtain a voice classification; determining whether one or more of the plurality of pitch lag candidates are in a temporal neighborhood of one or more previous pitch lag values; favoring the one or more of the plurality of pitch lag candidates determined to be in the temporal neighborhood of the one or more previous pitch lag values, by adaptive weighting, over other ones of the plurality of pitch lag candidates; and selecting the pitch lag value based on the voice classification and the one or more of the plurality of pitch lag candidates favored by the adaptive weighting.

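The candidate-weighting step in this abstract can be sketched as follows. This is only an illustration under stated assumptions: candidates are taken as already identified by correlation, the voice classification is reduced to a boolean `voiced` flag, and the boost values (1.25 and 1.05) and the neighborhood width are hypothetical.

```python
def choose_pitch_lag(candidates, prev_lags, voiced, neighborhood=3):
    """Pick one lag from `candidates`, a list of (lag, correlation) pairs.

    Candidates within `neighborhood` samples of any previous pitch lag are
    favored by a multiplicative weight that depends on the voice class.
    """
    # Adaptive weighting (assumed values): trust pitch continuity more
    # when the signal is classified as voiced.
    boost = 1.25 if voiced else 1.05
    best_lag, best_score = None, float("-inf")
    for lag, corr in candidates:
        near_previous = any(abs(lag - p) <= neighborhood for p in prev_lags)
        score = corr * (boost if near_previous else 1.0)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For example, with candidates `[(40, 0.80), (80, 0.85)]` and previous lags near 40, the voiced case selects 40 because continuity outweighs the slightly higher raw correlation at 80; the unvoiced case keeps 80.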

    Adaptive tilt compensation for synthesized speech residual
    3.
    Granted patent
    Adaptive tilt compensation for synthesized speech residual (status: in force)

    Publication No.: US06385573B1

    Publication date: 2002-05-07

    Application No.: US09156826

    Filing date: 1998-09-18

    Applicants: Yang Gao, Huan-Yu Su

    Inventors: Yang Gao, Huan-Yu Su

    IPC classification: G10L19/04

    Abstract: A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters are generated for higher quality decoding and reproduction. To achieve high quality in lower bit rate encoding modes, the speech encoder departs from the strict waveform matching criteria of regular CELP coders and strives to identify significant perceptual features of the input signal. To support lower bit rate encoding modes, a variety of techniques are applied many of which involve the classification of the input signal. For each bit rate mode selected, pluralities of fixed or innovation subcodebooks are selected for use in generating innovation vectors. At lower encoding bit rates, a decoder utilizes adaptive compensation to attempt to correct for spectral variations in the weighted synthesized residual. Although many approaches are possible, a long asymmetric window is applied to the synthesized residual to generate a reflection coefficient that is smoothed, scaled and used in a first order filter. Because the content of the window varies over time, the coefficient and therefore the filter varies (or adapts) to remove at least a portion of the spectral tilt. As a result, the synthesized speech signal sounds brighter without having introduced significant coding noise.

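The tilt-compensation chain named in this abstract (asymmetric window, reflection coefficient, smoothing, scaling, first-order filter) might look roughly like the Python sketch below. Everything concrete here is an assumption: the window shape, applying the window per frame rather than over a longer span, the sign convention k = R(1)/R(0), the smoothing and scale constants, and the class and function names.

```python
import math

def asymmetric_window(length, rise_fraction=0.75):
    """Long raised-cosine rise followed by a short cosine fall (assumed shape)."""
    rise = max(1, int(length * rise_fraction))
    fall = length - rise
    window = [0.5 - 0.5 * math.cos(math.pi * (n + 0.5) / rise) for n in range(rise)]
    window += [math.cos(0.5 * math.pi * (n + 0.5) / fall) for n in range(fall)]
    return window

def first_reflection(samples, window):
    """Normalized lag-1 autocorrelation of the windowed samples."""
    xw = [s * w for s, w in zip(samples, window)]
    r0 = sum(v * v for v in xw)
    r1 = sum(xw[n] * xw[n - 1] for n in range(1, len(xw)))
    return r1 / r0 if r0 > 0 else 0.0

class TiltCompensator:
    def __init__(self, scale=0.5, smoothing=0.75):
        self.scale = scale          # hypothetical scaling of the coefficient
        self.smoothing = smoothing  # hypothetical one-pole smoothing factor
        self.k_smooth = 0.0
        self.mu = 0.0
        self.prev_sample = 0.0

    def process(self, residual):
        k = first_reflection(residual, asymmetric_window(len(residual)))
        # Smooth the coefficient across frames, then scale it.
        self.k_smooth = self.smoothing * self.k_smooth + (1.0 - self.smoothing) * k
        self.mu = self.scale * self.k_smooth
        out = []
        for x in residual:
            # First-order filter 1 - mu * z^-1 (high-pass emphasis when mu > 0).
            out.append(x - self.mu * self.prev_sample)
            self.prev_sample = x
        return out
```

Because the coefficient is recomputed for every frame, the filter adapts as the residual changes; a residual with strong low-frequency tilt yields a positive `mu` and therefore a brighter output.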

    Adaptive tilt compensation for synthesized speech
    4.
    Granted patent
    Adaptive tilt compensation for synthesized speech (status: in force)

    Publication No.: US09401156B2

    Publication date: 2016-07-26

    Application No.: US12215649

    Filing date: 2008-06-27

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    Abstract: There is provided a method of using an adaptive tilt compensation by a speech decoder. The method comprises receiving a bit stream including a plurality of parameters representative of a speech signal; identifying an adaptive code vector and a fixed code vector using the plurality of parameters; scaling the adaptive code vector and the fixed code vector to generate a scaled adaptive code vector and a scaled fixed code vector; summing the scaled adaptive code vector and the scaled fixed code vector to generate a synthesized output; calculating a first reflection coefficient based on the plurality of parameters representative of the speech signal; multiplying the first reflection coefficient by a factor to generate a tilt factor; and applying the tilt factor to the synthesized output based on an encoding bit rate.

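The decoder steps listed in this abstract can be traced in a short Python sketch. It assumes the code vectors, gains, and first reflection coefficient `k1` have already been recovered from the bit stream; the function name, the `factor` value, the 8000 bit/s threshold, and the first-order filter form are hypothetical choices, not details taken from the patent.

```python
def decode_subframe(adaptive_vec, fixed_vec, gain_p, gain_c, k1, bit_rate,
                    factor=0.5, low_rate_threshold=8000):
    """Synthesize one subframe and apply tilt compensation at low bit rates."""
    # Scale the adaptive and fixed code vectors and sum them.
    synthesized = [gain_p * a + gain_c * c for a, c in zip(adaptive_vec, fixed_vec)]
    # Tilt factor = reflection coefficient times a factor; applied only
    # at low encoding bit rates (the threshold is an assumption).
    tilt = factor * k1 if bit_rate <= low_rate_threshold else 0.0
    out, prev = [], 0.0
    for x in synthesized:
        out.append(x - tilt * prev)  # first-order filter 1 - tilt * z^-1
        prev = x
    return out
```

At a high bit rate the function returns the plain sum of the scaled vectors, so the compensation is effectively switched off.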

    Adaptive tilt compensation for synthesized speech
    5.
    Patent application
    Adaptive tilt compensation for synthesized speech (status: in force)

    Publication No.: US20080294429A1

    Publication date: 2008-11-27

    Application No.: US12215649

    Filing date: 2008-06-27

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    IPC classification: G10L19/12

    Abstract: There is provided a method of using an adaptive tilt compensation by a speech decoder. The method comprises receiving a bit stream including a plurality of parameters representative of a speech signal; identifying an adaptive code vector and a fixed code vector using the plurality of parameters; scaling the adaptive code vector and the fixed code vector to generate a scaled adaptive code vector and a scaled fixed code vector; summing the scaled adaptive code vector and the scaled fixed code vector to generate a synthesized output; calculating a first reflection coefficient based on the plurality of parameters representative of the speech signal; multiplying the first reflection coefficient by a factor to generate a tilt factor; and applying the tilt factor to the synthesized output based on an encoding bit rate.


    Selection of preferential pitch value for speech processing
    6.
    Patent application
    Selection of preferential pitch value for speech processing (status: under examination, published)

    Publication No.: US20080288246A1

    Publication date: 2008-11-20

    Application No.: US12220480

    Filing date: 2008-07-23

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    IPC classification: G10L11/04

    Abstract: There is provided a method of using a processing circuitry for selecting a preferential pitch lag value from a plurality of pitch lag values, including a first pitch lag value and a second pitch lag value, for coding an input speech signal. The method comprises determining a first timing relationship between a previous pitch lag value and at least one of the plurality of pitch lag values; determining a second timing relationship between the first pitch lag value and the second pitch lag value; favoring one of the first pitch lag value and the second pitch lag value based on the first timing relationship and the second timing relationship to select one of the first pitch lag value and the second pitch lag value as the preferential pitch lag value; and converting the input speech signal into an encoded speech using the preferential pitch lag value.

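One way to read the two "timing relationships" in this abstract is sketched below in Python. The interpretation is an assumption: the first relationship is taken to be closeness to the previous pitch lag, the second is taken to be whether one lag is roughly an integer multiple of the other, and the decision order, tolerance, and function name are hypothetical.

```python
def preferential_lag(lag1, lag2, prev_lag, tolerance=2):
    """Choose between two pitch lag values (lag1 is assumed to be the default)."""
    # First timing relationship (assumed): closeness to the previous lag.
    near1 = abs(lag1 - prev_lag) <= tolerance
    near2 = abs(lag2 - prev_lag) <= tolerance
    # Second timing relationship (assumed): is the longer lag roughly an
    # integer multiple of the shorter one?
    short, long_ = sorted((lag1, lag2))
    multiple = round(long_ / short)
    is_multiple = multiple >= 2 and abs(long_ - multiple * short) <= tolerance
    if near1 != near2:
        # Exactly one value continues the previous pitch track: favor it.
        return lag1 if near1 else lag2
    if is_multiple:
        # Likely pitch doubling or tripling: favor the shorter lag.
        return short
    return lag1
```

For instance, `preferential_lag(80, 40, prev_lag=41)` returns 40 on continuity grounds, while `preferential_lag(80, 40, prev_lag=60)` returns 40 because 80 looks like a doubled period.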

    System for speech encoding having an adaptive encoding arrangement
    7.
    Patent application
    System for speech encoding having an adaptive encoding arrangement (status: under examination, published)

    Publication No.: US20070255561A1

    Publication date: 2007-11-01

    Application No.: US11827915

    Filing date: 2007-07-12

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    IPC classification: G10L21/00

    Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to where the generally periodic component of the speech is generally not stationary or less than completely periodic and requires greater frequency of updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.


    Coding based on spectral content of a speech signal
    8.
    Granted patent
    Coding based on spectral content of a speech signal (status: in force)

    Publication No.: US06937979B2

    Publication date: 2005-08-30

    Application No.: US09896682

    Filing date: 2001-06-29

    Applicants: Yang Gao, Huan-Yu Su

    Inventors: Yang Gao, Huan-Yu Su

    IPC classification: G10L19/14, G10L21/02, G10L19/00

    Abstract: In a coding procedure, a spectral content of a speech signal is estimated. A preferential coding algorithm or preferential value of at least one coding parameter is selected based on the estimated spectral content of the speech signal. The speech signal is coded in accordance with the selected coding algorithm or the selected coding parameter to control the operation of one or more of the following: a pre-processing filter, a post-processing filter, a coding control coefficient, a weighting filter, a synthesis filter, and a quantization table.

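The idea of choosing a coding parameter from an estimate of spectral content can be illustrated with a very small Python sketch. The abstract does not say how spectral content is estimated; this sketch assumes a normalized lag-1 autocorrelation as a crude tilt measure, and the threshold, the parameter names `gamma1` and `gamma2`, and their values are purely hypothetical stand-ins for weighting-filter settings.

```python
def spectral_tilt(frame):
    """Crude spectral-content estimate: normalized lag-1 autocorrelation."""
    r0 = sum(x * x for x in frame)
    r1 = sum(frame[n] * frame[n - 1] for n in range(1, len(frame)))
    return r1 / r0 if r0 > 0 else 0.0

def select_weighting_params(frame, threshold=0.3):
    """Return a (hypothetical) weighting-filter parameter set for the frame."""
    if spectral_tilt(frame) > threshold:
        # Energy concentrated at low frequencies: "sloped" spectrum.
        return {"profile": "sloped", "gamma1": 0.9, "gamma2": 0.6}
    # Otherwise treat the spectrum as comparatively flat or high-pass.
    return {"profile": "flat", "gamma1": 0.9, "gamma2": 0.5}
```

The same pattern would extend to the other elements the abstract lists (post-filters, quantization tables) by returning a different setting per spectral class.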

    Codebook sharing for LSF quantization
    9.
    Patent application
    Codebook sharing for LSF quantization (status: in force)

    Publication No.: US20090164210A1

    Publication date: 2009-06-25

    Application No.: US12321950

    Filing date: 2009-01-26

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    IPC classification: G10L19/14

    Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to where the generally periodic component of the speech is generally not stationary or less than completely periodic and requires greater frequency of updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.


    Adaptive gain reduction for encoding a speech signal
    10.
    Patent application
    Adaptive gain reduction for encoding a speech signal (status: in force)

    Publication No.: US20080319740A1

    Publication date: 2008-12-25

    Application No.: US12218242

    Filing date: 2008-07-11

    Applicants: Huan-Yu Su, Yang Gao

    Inventors: Huan-Yu Su, Yang Gao

    IPC classification: G10L19/14

    Abstract: There is provided a method of encoding an input speech signal. The method comprises identifying a fixed codebook vector from a fixed codebook; identifying an adaptive codebook vector from an adaptive codebook; calculating an adaptive codebook gain; reducing the adaptive codebook gain by an amount; optimally selecting a fixed codebook gain based on the adaptive codebook gain while both the fixed codebook vector and the adaptive codebook vector remain fixed; and converting the input speech signal into an encoded speech using the fixed codebook gain, the adaptive codebook gain, the fixed codebook vector and the adaptive codebook vector. The amount of reducing the adaptive codebook gain may be varied.

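The gain steps in this abstract can be sketched in Python as a two-stage least-squares fit. This is a reading under assumptions, not the patented procedure: "optimally" is taken to mean minimizing squared error against a target vector, the reduction is applied as a multiplicative fraction, the gain bounds (0 to 1.2) are a common CELP convention assumed here, and `encode_gains` with its parameters is a hypothetical name.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode_gains(target, adaptive_contrib, fixed_contrib, reduction=0.1):
    """Return (reduced adaptive gain, fixed gain) for fixed code vectors.

    `adaptive_contrib` and `fixed_contrib` are the (filtered) adaptive and
    fixed codebook vectors; both stay unchanged while the gains are chosen.
    """
    # Least-squares adaptive codebook gain, bounded to an assumed range.
    energy_a = dot(adaptive_contrib, adaptive_contrib)
    gain_p = dot(target, adaptive_contrib) / energy_a if energy_a > 0 else 0.0
    gain_p = min(max(gain_p, 0.0), 1.2)
    # Reduce the adaptive gain by a variable amount (fraction is an assumption).
    gain_p *= (1.0 - reduction)
    # With the reduced adaptive gain held fixed, pick the fixed codebook gain
    # that minimizes the remaining squared error.
    remaining = [t - gain_p * a for t, a in zip(target, adaptive_contrib)]
    energy_c = dot(fixed_contrib, fixed_contrib)
    gain_c = dot(remaining, fixed_contrib) / energy_c if energy_c > 0 else 0.0
    return gain_p, gain_c
```

Re-optimizing the fixed gain after the reduction lets the fixed codebook absorb the part of the target that the weakened adaptive contribution no longer covers.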