Method and device for frequency-selective pitch enhancement of synthesized speech
    1.
    Granted patent (in force)

    Publication number: US07529660B2

    Publication date: 2009-05-05

    Application number: US10515553

    Filing date: 2003-05-30

    IPC classification: G10L19/02 G10L21/02

    Abstract: In a method and device for post-processing a decoded sound signal with a view to enhancing the perceived quality of this decoded sound signal, the decoded sound signal is divided into a plurality of frequency sub-band signals, and post-processing is applied to at least one of the frequency sub-band signals. After post-processing of this at least one frequency sub-band signal, the frequency sub-band signals may be added to produce an output post-processed decoded sound signal. In this manner, the post-processing can be localized to a desired sub-band or sub-bands while leaving other sub-bands virtually unaltered.
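
The split, post-process, and recombine flow described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the two-band split (a moving-average low-pass plus its complement), the function names, and the simple pitch-period averaging used as the post-processor are all assumptions introduced here.

```python
def moving_average_lowpass(x, taps=5):
    """Zero-phase moving-average low-pass filter (odd tap count, edges clamped)."""
    half = taps // 2
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(-half, half + 1):
            j = min(max(i + k, 0), n - 1)  # clamp index at the signal edges
            acc += x[j]
        out.append(acc / taps)
    return out


def split_bands(x, taps=5):
    """Split x into a low band and a complementary high band (low + high == x)."""
    low = moving_average_lowpass(x, taps)
    high = [xi - li for xi, li in zip(x, low)]
    return low, high


def pitch_enhance(band, pitch_lag, alpha=0.5):
    """Hypothetical post-processor: blend each sample with the samples one
    pitch period before and after it. alpha = 0 leaves the band unchanged."""
    n = len(band)
    out = []
    for i in range(n):
        prev = band[i - pitch_lag] if i - pitch_lag >= 0 else band[i]
        nxt = band[i + pitch_lag] if i + pitch_lag < n else band[i]
        out.append((1.0 - alpha) * band[i] + alpha * 0.5 * (prev + nxt))
    return out


def postprocess(x, pitch_lag, alpha=0.5, taps=5):
    """Enhance only the low band, then add the untouched high band back."""
    low, high = split_bands(x, taps)
    low_enhanced = pitch_enhance(low, pitch_lag, alpha)
    return [l + h for l, h in zip(low_enhanced, high)]
```

Because the high band is computed as the complement of the low band, the bands sum back to the input when no enhancement is applied, which is one simple way to leave the unprocessed band unaltered.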

    Method and device for frequency-selective pitch enhancement of synthesized speech
    2.
    Patent application (in force)

    Publication number: US20050165603A1

    Publication date: 2005-07-28

    Application number: US10515553

    Filing date: 2003-05-30

    Abstract: In a method and device for post-processing a decoded sound signal with a view to enhancing the perceived quality of this decoded sound signal, the decoded sound signal is divided into a plurality of frequency sub-band signals, and post-processing is applied to at least one of the frequency sub-band signals. After post-processing of this at least one frequency sub-band signal, the frequency sub-band signals may be added to produce an output post-processed decoded sound signal. In this manner, the post-processing can be localized to a desired sub-band or sub-bands while leaving other sub-bands virtually unaltered.

    Signal modification method for efficient coding of speech signals
    3.
    Patent application (in force)

    Publication number: US20090063139A1

    Publication date: 2009-03-05

    Application number: US12288592

    Filing date: 2008-10-21

    IPC classification: G10L11/04

    CPC classification: G10L19/08

    Abstract: For determining a long-term-prediction delay parameter characterizing a long-term prediction in a technique using signal modification for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, a feature of the sound signal is located in a previous frame, a corresponding feature of the sound signal is located in a current frame, and the long-term-prediction delay parameter is determined for the current frame while mapping, with the long-term prediction, the signal feature of the previous frame with the corresponding signal feature of the current frame. In a signal modification method for implementation into a technique for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, each frame of the sound signal is partitioned into a plurality of signal segments, and at least a part of the signal segments of the frame are warped while constraining the warped signal segments inside the frame. For searching pitch pulses in a sound signal, a residual signal is produced by filtering the sound signal through a linear prediction analysis filter, a weighted sound signal is produced by processing the sound signal through a weighting filter, the weighted sound signal being indicative of signal periodicity, a synthesized weighted sound signal is produced by filtering a synthesized speech signal produced during a last subframe of a previous frame of the sound signal through the weighting filter, a last pitch pulse of the sound signal of the previous frame is located from the residual signal, a pitch pulse prototype of given length is extracted around the position of the last pitch pulse of the sound signal of the previous frame using the synthesized weighted sound signal, and the pitch pulses are located in a current frame using the pitch pulse prototype.

    Signal modification method for efficient coding of speech signals
    4.
    Granted patent (in force)

    Publication number: US08121833B2

    Publication date: 2012-02-21

    Application number: US12288592

    Filing date: 2008-10-21

    IPC classification: G10L19/00

    CPC classification: G10L19/08

    Abstract: The exemplary embodiments of the invention provide at least a method and an apparatus to perform operations including dividing a sound signal into a series of successive frames, dividing each frame into a number of subframes, producing a residual signal by filtering the sound signal through a linear prediction analysis filter, locating a last pitch pulse of the sound signal of a previous frame from the residual signal, extracting a pitch pulse prototype of given length around a position of the last pitch pulse of the previous frame using the residual signal, and locating pitch pulses in a current frame using the pitch pulse prototype.
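
The prototype-based pulse search in this abstract can be illustrated with a short sketch. It assumes the residual signal and the position of the last pitch pulse of the previous frame are already available, and it adds assumptions not stated in the abstract: a known pitch lag estimate, a small search window around each predicted position, and plain cross-correlation as the matching score. All function and parameter names are hypothetical.

```python
def locate_pitch_pulses(residual, last_pulse, pitch_lag,
                        frame_start, frame_end, proto_len=5, search=2):
    """Locate pitch pulses in residual[frame_start:frame_end].

    A prototype of proto_len samples is cut around last_pulse (the last pulse
    of the previous frame). Each following pulse is predicted one pitch_lag
    later and refined by maximizing the correlation with the prototype.
    """
    if pitch_lag <= search:
        raise ValueError("pitch_lag must exceed the search radius")
    half = proto_len // 2
    if last_pulse - half < 0 or last_pulse + half >= len(residual):
        raise ValueError("prototype window falls outside the signal")
    prototype = residual[last_pulse - half:last_pulse + half + 1]

    pulses = []
    position = last_pulse
    while True:
        predicted = position + pitch_lag
        if predicted >= frame_end:
            break
        best, best_score = None, None
        for cand in range(predicted - search, predicted + search + 1):
            if cand - half < 0 or cand + half >= len(residual):
                continue  # candidate window would leave the signal
            window = residual[cand - half:cand + half + 1]
            score = sum(w * p for w, p in zip(window, prototype))
            if best_score is None or score > best_score:
                best, best_score = cand, score
        if best is None:
            break
        if frame_start <= best < frame_end:
            pulses.append(best)
        position = best  # continue the search from the refined position
    return pulses
```

For example, with pulses at samples 5, 16, 25 and 36, a previous frame covering samples 0 to 19 and a lag estimate of 10, the search started from sample 16 follows the jittered pulses in the current frame rather than the strictly periodic grid.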

    Signal modification method for efficient coding of speech signals
    5.
    Patent application (in force)

    Publication number: US20050071153A1

    Publication date: 2005-03-31

    Application number: US10498254

    Filing date: 2002-12-13

    IPC classification: G10L19/12 G10L19/10

    CPC classification: G10L19/08

    Abstract: For determining a long-term-prediction delay parameter characterizing a long-term prediction in a technique using signal modification for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, a feature of the sound signal is located in a previous frame, a corresponding feature of the sound signal is located in a current frame, and the long-term-prediction delay parameter is determined for the current frame while mapping, with the long-term prediction, the signal feature of the previous frame with the corresponding signal feature of the current frame. In a signal modification method for implementation into a technique for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, each frame of the sound signal is partitioned into a plurality of signal segments, and at least a part of the signal segments of the frame are warped while constraining the warped signal segments inside the frame. For searching pitch pulses in a sound signal, a residual signal is produced by filtering the sound signal through a linear prediction analysis filter, a weighted sound signal is produced by processing the sound signal through a weighting filter, the weighted sound signal being indicative of signal periodicity, a synthesized weighted sound signal is produced by filtering a synthesized speech signal produced during a last subframe of a previous frame of the sound signal through the weighting filter, a last pitch pulse of the sound signal of the previous frame is located from the residual signal, a pitch pulse prototype of given length is extracted around the position of the last pitch pulse of the sound signal of the previous frame using the synthesized weighted sound signal, and the pitch pulses are located in a current frame using the pitch pulse prototype.

    Signal modification method for efficient coding of speech signals
    6.
    Granted patent (in force)

    Publication number: US07680651B2

    Publication date: 2010-03-16

    Application number: US10498254

    Filing date: 2002-12-13

    IPC classification: G10L19/00

    CPC classification: G10L19/08

    Abstract: In accordance with the exemplary embodiments of the invention, there is disclosed at least a method and an apparatus for determining a long-term-prediction delay parameter characterizing a long-term prediction in a technique using signal modification for digitally encoding a sound signal. The sound signal is divided into a series of successive frames, a feature of the sound signal is located in a previous frame, a corresponding feature of the sound signal is located in a current frame, and the long-term-prediction delay parameter is determined for the current frame while mapping, with the long-term prediction, the signal feature of the previous frame with the corresponding signal feature of the current frame. Each divided frame of the sound signal is partitioned into a plurality of signal segments, and at least a part of the signal segments of the frame are warped while constraining the warped signal segments inside the frame.

    System and method for adaptive transmission of comfort noise parameters during discontinuous speech transmission
    7.
    Granted patent (in force)

    Publication number: US07693708B2

    Publication date: 2010-04-06

    Application number: US11424365

    Filing date: 2006-06-15

    IPC classification: G10L19/12

    CPC classification: G10L19/012 G10L19/24

    Abstract: An apparatus is provided that includes at least one entity for transmitting speech signals in a discontinuous transmission mode, including transmitting speech frames interspersed with frames including comfort noise parameters during periods of speech pauses. The entity or entities include a first entity for estimating a current noise value. In addition, the apparatus includes a second entity for selectively controlling a rate at which the frames including comfort noise parameters are transmitted during the periods of speech pauses based upon the estimated current noise value.
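
A rough sketch of noise-dependent scheduling of comfort noise frames follows. The abstract does not say how the noise estimate maps to a transmission rate, so the mapping is left to a caller-supplied function; the frame labels, the function names, and the thresholds in the usage example are hypothetical.

```python
def schedule_frames(vad_flags, noise_values, interval_for):
    """Label each frame as 'SPEECH', 'CN_PARAMS' or 'NO_DATA'.

    vad_flags[i] is True for active speech. During a pause, a comfort noise
    parameter frame is sent at the start of the pause and then every
    interval_for(noise) frames, where noise is the current noise estimate.
    """
    labels = []
    since_cn = None  # frames elapsed since the last comfort noise frame
    for active, noise in zip(vad_flags, noise_values):
        if active:
            labels.append("SPEECH")
            since_cn = None  # a new pause restarts the schedule
            continue
        if since_cn is None or since_cn + 1 >= interval_for(noise):
            labels.append("CN_PARAMS")
            since_cn = 0
        else:
            labels.append("NO_DATA")
            since_cn += 1
    return labels


# Hypothetical mapping: update more often when the noise estimate is high.
def example_interval(noise):
    return 4 if noise < 30 else 2
```

Passing the mapping as a function keeps the scheduler independent of any particular rule; the direction and thresholds of `example_interval` are illustrative only.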

    Methods and devices for source controlled variable bit-rate wideband speech coding
    8.
    Patent application (in force)

    Publication number: US20050177364A1

    Publication date: 2005-08-11

    Application number: US11039539

    Filing date: 2005-01-19

    Applicant: Milan Jelinek

    Inventor: Milan Jelinek

    IPC classification: G10L11/06 G10L19/00 G10L19/14

    Abstract: Speech signal classification and encoding systems and methods are disclosed herein. The signal classification is done in three steps, each of them discriminating a specific signal class. First, a voice activity detector (VAD) discriminates between active and inactive speech frames. If an inactive speech frame is detected (background noise signal), then the classification chain ends and the frame is encoded with comfort noise generation (CNG). If an active speech frame is detected, the frame is subjected to a second classifier dedicated to discriminating unvoiced frames. If the classifier classifies the frame as an unvoiced speech signal, the classification chain ends, and the frame is encoded using a coding method optimized for unvoiced signals. Otherwise, the speech frame is passed through to the “stable voiced” classification module. If the frame is classified as a stable voiced frame, then the frame is encoded using a coding method optimized for stable voiced signals. Otherwise, the frame is likely to contain a non-stationary speech segment such as a voiced onset or rapidly evolving voiced speech signal. In this case a general-purpose speech coder is used at a high bit rate for sustaining good subjective quality.
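
The three-step classification chain in this abstract maps naturally onto a cascade of early returns. The sketch below shows only the control flow; the three classifier callables and the `Mode` names are placeholders assumed here, since the abstract does not describe how each decision is computed.

```python
from enum import Enum


class Mode(Enum):
    CNG = "comfort noise generation"
    UNVOICED = "unvoiced-optimized coding"
    STABLE_VOICED = "stable-voiced-optimized coding"
    GENERIC = "general-purpose coding at a high bit rate"


def select_mode(frame, is_active, is_unvoiced, is_stable_voiced):
    """Run the classification chain and return the coding mode for one frame.

    Each classifier is a callable taking the frame and returning a bool.
    The chain stops at the first class that matches.
    """
    if not is_active(frame):        # step 1: voice activity detection
        return Mode.CNG
    if is_unvoiced(frame):          # step 2: unvoiced classifier
        return Mode.UNVOICED
    if is_stable_voiced(frame):     # step 3: stable voiced classifier
        return Mode.STABLE_VOICED
    return Mode.GENERIC             # e.g. voiced onsets, rapidly evolving speech
```

Ordering the checks this way means the general-purpose coder is reached only by frames that every specialized classifier has rejected.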

    Method and device for speech enhancement in the presence of background noise
    9.
    Granted patent (in force)

    Publication number: US08577675B2

    Publication date: 2013-11-05

    Application number: US11021938

    Filing date: 2004-12-22

    Applicant: Milan Jelinek

    Inventor: Milan Jelinek

    CPC classification: G10L21/0208

    Abstract: In one aspect thereof the invention provides a method for noise suppression of a speech signal that includes, for a speech signal having a frequency domain representation dividable into a plurality of frequency bins, determining a value of a scaling gain for at least some of said frequency bins and calculating smoothed scaling gain values. Calculating smoothed scaling gain values includes, for the at least some of the frequency bins, combining a currently determined value of the scaling gain and a previously determined value of the smoothed scaling gain. In another aspect a method partitions the plurality of frequency bins into a first set of contiguous frequency bins and a second set of contiguous frequency bins having a boundary frequency therebetween, where the boundary frequency differentiates between noise suppression techniques, and changes a value of the boundary frequency as a function of the spectral content of the speech signal.
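
One common way to combine a current gain with the previous smoothed gain is first-order recursive smoothing, sketched below per frequency bin. The abstract does not specify the combination rule, so the fixed weight `alpha` and the function name are assumptions made for illustration.

```python
def smooth_gains(current, previous_smoothed, alpha=0.9):
    """Per-bin recursive smoothing of scaling gains.

    smoothed[k] = alpha * previous_smoothed[k] + (1 - alpha) * current[k]
    A larger alpha gives slower, smoother gain changes between frames.
    """
    if len(current) != len(previous_smoothed):
        raise ValueError("gain vectors must have the same number of bins")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * p + (1.0 - alpha) * g
            for g, p in zip(current, previous_smoothed)]
```

Calling the function once per frame and feeding its output back in as `previous_smoothed` yields gains that approach a steady current value gradually instead of jumping to it.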

    Method and device for efficient in-band dim-and-burst signaling and half-rate max operation in variable bit-rate wideband speech coding for CDMA wireless systems
    10.
    Granted patent (in force)

    Publication number: US08224657B2

    Publication date: 2012-07-17

    Application number: US10520374

    Filing date: 2003-06-27

    IPC classification: G10L19/00 G10L21/04

    CPC classification: G10L19/24

    Abstract: In the method and device for interoperating a first station using a first communication scheme and comprising a first coder and a first decoder with a second station using a second communication scheme and comprising a second coder and a second decoder, communication between the first and second stations is conducted by transmitting signal-coding parameters related to a sound signal from the coder of one of the first and second stations to the decoder of the other station. The sound signal is classified to determine whether the signal-coding parameters should be transmitted from the coder of one station to the decoder of the other station using a first communication mode in which full bit rate is used for transmission of the signal-coding parameters. When classification of the sound signal determines that the signal-coding parameters should be transmitted using the first communication mode and when a request to transmit the signal-coding parameters from the coder of one station to the decoder of the other station using a second communication mode designed to reduce bit rate during transmission of the signal-coding parameters is received, a portion of the signal-coding parameters from the coder of one station is dropped and the remaining signal-coding parameters are transmitted to the decoder of the other station using the second communication mode. The dropped portion of the signal-coding parameters is regenerated before the decoder of the other station decodes the signal-coding parameters.
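
The drop-and-regenerate step can be sketched with a parameter dictionary. The abstract does not say which parameters are dropped or how they are regenerated, so the field names, the choice of dropped field, and the seeded pseudo-random regeneration below are purely hypothetical.

```python
import random

# Hypothetical set of fields removed when the reduced-rate mode is requested.
DROPPED_FIELDS = ("codebook_indices",)


def drop_for_reduced_rate(params, dropped=DROPPED_FIELDS):
    """Return a copy of the frame parameters without the dropped fields."""
    return {name: value for name, value in params.items() if name not in dropped}


def regenerate_dropped(received, template, seed=0):
    """Fill in missing fields before decoding.

    template maps each droppable field to (count, upper_bound); missing
    fields are filled with `count` pseudo-random integers below upper_bound.
    Fields that were received are left untouched.
    """
    rng = random.Random(seed)
    full = dict(received)
    for name, (count, upper_bound) in template.items():
        if name not in full:
            full[name] = [rng.randrange(upper_bound) for _ in range(count)]
    return full
```

A frame passed through `drop_for_reduced_rate` and then `regenerate_dropped` has the same set of fields as a full-rate frame, so a decoder written for the full-rate layout can consume it unchanged.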
