EMBEDDED SILENCE AND BACKGROUND NOISE COMPRESSION
    1.
    Invention application
    Status: under examination (published)

    Publication number: WO2008100385A3

    Publication date: 2009-04-23

    Application number: PCT/US2008001356

    Filing date: 2008-02-01

    CPC classification number: G10L19/24 G10L19/012 G10L19/0208

    Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.
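
The data flow in this abstract (band split, narrowband encoding that also emits an auxiliary signal, then high-band encoding that depends on that auxiliary signal) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the moving-average low-pass filter, the residual used as the high band, the use of frame energy as the encoded parameter, and all function names.

```python
def moving_average(x, width=4):
    """Toy low-pass filter: causal moving average (assumed, not from the source)."""
    out = []
    for i in range(len(x)):
        window = x[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def split_bands(x, width=4):
    """Split a frame into a low band and the complementary high band."""
    low = moving_average(x, width)
    high = [s - l for s, l in zip(x, low)]  # residual acts as a toy high-pass
    return low, high


def energy(x):
    """Mean energy per sample; 0.0 for an empty frame."""
    return sum(s * s for s in x) / len(x) if x else 0.0


def encode_narrowband_inactive(low):
    """Return (payload, auxiliary). The auxiliary is the low-to-high signal."""
    e = energy(low)
    return {"nb_energy": e}, e


def encode_wideband_inactive(high, aux):
    """Encode the high band relative to the narrowband auxiliary value."""
    ratio = energy(high) / aux if aux > 0.0 else 0.0
    return {"hb_ratio": ratio}


def encode_inactive_frame(frame):
    """Run the whole inactive-frame path and return both payloads."""
    low, high = split_bands(frame)
    nb_payload, aux = encode_narrowband_inactive(low)
    wb_payload = encode_wideband_inactive(high, aux)
    return nb_payload, wb_payload
```

The point of the sketch is only the dependency order: the high-band payload cannot be produced until the narrowband encoder has run, which is what makes the scheme embedded.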


    METHOD AND SYSTEM FOR REDUCING EFFECTS OF NOISE PRODUCING ARTIFACTS IN A VOICE CODEC
    2.
    Invention application
    Status: under examination (published)

    Publication number: WO2007111645A3

    Publication date: 2008-10-02

    Application number: PCT/US2006041434

    Filing date: 2006-10-23

    CPC classification number: H03G3/3089 G10L21/045 H03G3/341

    Abstract: There is provided a method of reducing effect of noise producing artifacts in a speech signal. The method comprises obtaining (310) a plurality of incoming samples of a speech subframe; summing (310) an energy level for each of the plurality of incoming samples to generate a total input level; comparing (320) the total input level with a predetermined threshold; setting (340) a gain value as a function of the total input level, where the gain value is between zero (0) and one (1), and where the function results in a lower gain value when the total input level is indicative of a silence area than when the total input level is indicative of a non-silence area; and multiplying (350) the plurality of incoming samples of the speech subframe by the gain value.
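
The steps in this abstract (sum the sample energies, compare with a threshold, derive a gain between 0 and 1, scale the subframe) can be sketched as follows. The linear ramp and the `floor_gain` parameter are assumptions chosen for illustration; the abstract only requires that the gain be lower in silence areas than in non-silence areas. The threshold is assumed to be positive.

```python
def subframe_gain(samples, threshold, floor_gain=0.25):
    """Return a gain in [floor_gain, 1.0] from the summed energy of one subframe."""
    total = sum(s * s for s in samples)  # total input level
    if total >= threshold:
        return 1.0  # non-silence area: leave the subframe untouched
    # Silence area: ramp linearly from floor_gain (no energy) up to 1.0 (at threshold).
    return floor_gain + (1.0 - floor_gain) * (total / threshold)


def apply_gain(samples, threshold, floor_gain=0.25):
    """Multiply every sample of the subframe by the computed gain."""
    g = subframe_gain(samples, threshold, floor_gain)
    return [g * s for s in samples]
```

A continuous ramp is used here rather than a hard switch so that the gain does not jump at the threshold, which would itself be audible.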


    EMBEDDED SILENCE AND BACKGROUND NOISE COMPRESSION
    3.
    Invention application
    Status: under examination (published)

    Publication number: WO2008100385A2

    Publication date: 2008-08-21

    Application number: PCT/US2008001356

    Filing date: 2008-02-01

    CPC classification number: G10L19/24 G10L19/012 G10L19/0208

    Abstract: There is provided a method for use by a speech encoder to encode an input speech signal. The method comprises receiving the input speech signal; determining whether the input speech signal includes an active speech signal or an inactive speech signal; low-pass filtering the inactive speech signal to generate a narrowband inactive speech signal; high-pass filtering the inactive speech signal to generate a high-band inactive speech signal; encoding the narrowband inactive speech signal using a narrowband inactive speech encoder to generate an encoded narrowband inactive speech; generating a low-to-high auxiliary signal by the narrowband inactive speech encoder based on the narrowband inactive speech signal; encoding the high-band inactive speech signal using a wideband inactive speech encoder to generate an encoded wideband inactive speech based on the low-to-high auxiliary signal from the narrowband inactive speech encoder; and transmitting the encoded narrowband inactive speech and the encoded wideband inactive speech.


    OPEN-LOOP PITCH TRACK SMOOTHING
    4.
    Invention application

    Publication number: WO2007111649A2

    Publication date: 2007-10-04

    Application number: PCT/US2006042096

    Filing date: 2006-10-27

    Inventor: GAO YANG

    CPC classification number: G10L25/90

    Abstract: There is provided a speech encoder for performing an algorithm that comprises obtaining (205) a plurality of open-loop pitch candidates from a current frame of a speech signal, the plurality of open-loop pitch candidates including a first open-loop pitch candidate and a second open-loop pitch candidate; obtaining (205) a voicing information from one or more previous frames; and selecting (280) one of the plurality of open-loop pitch candidates as a final pitch of the current frame using the voicing information from the one or more previous frames. In one aspect, the voicing information from the one or more previous frames includes a previous pitch of the one or more previous frames. In a further aspect, selecting the final pitch of the current frame includes selecting (210), from the plurality of open-loop pitch candidates, an initial open-loop pitch that has the maximum long-term correlation value.
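
One way to picture this selection is the sketch below: start from the candidate with the maximum correlation, then let the previous frame's pitch pull the decision toward a nearby candidate. The specific rule is an assumption for illustration, as are the parameters `tol` and `max_dev` and the function name; the abstract does not say how the voicing information is weighed. Correlations are assumed to be non-negative.

```python
def select_open_loop_pitch(candidates, prev_pitch, prev_voiced,
                           tol=0.85, max_dev=0.2):
    """Pick a final pitch lag from (lag, correlation) candidates.

    prev_pitch and prev_voiced stand in for the voicing information
    carried over from previous frames.
    """
    # Initial choice: the candidate with the maximum long-term correlation.
    best_lag, best_corr = max(candidates, key=lambda c: c[1])
    if not prev_voiced:
        return best_lag
    # Candidates close to the previous pitch whose correlation is nearly as good.
    near = [
        (lag, corr) for lag, corr in candidates
        if abs(lag - prev_pitch) <= max_dev * prev_pitch
        and corr >= tol * best_corr
    ]
    if near:
        return max(near, key=lambda c: c[1])[0]
    return best_lag
```

For example, with candidates at lags 40 and 80 and a previous pitch of 41, the lag-40 candidate wins even though lag 80 correlates slightly better, which avoids a pitch-doubling jump between frames.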


    ADAPTIVE VOICE MODE EXTENSION FOR A VOICE ACTIVITY DETECTOR
    5.
    Invention application
    Status: under examination (published)

    Publication number: WO2006104576A3

    Publication date: 2007-07-19

    Application number: PCT/US2006004687

    Filing date: 2006-01-26

    CPC classification number: G10L25/78 G10L2025/786

    Abstract: There is provided a voice activity detection method for indicating an active voice mode and an inactive voice mode. The method comprises receiving a first portion of an input signal; determining that the first portion of the input signal includes an active voice signal; indicating the active voice mode in response to the determining that the first portion of the input signal includes the active voice signal; receiving a second portion of the input signal immediately following the first portion of the input signal; determining that the second portion of the input signal includes an inactive voice signal; extending the indicating the active voice mode for a period of time after determining that the second portion of the input signal includes the inactive voice signal, wherein the period of time varies based on one or more conditions; and indicating the inactive voice mode after expiration of the period of time.
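
The extension described here is commonly called a hangover. A minimal sketch follows, assuming per-frame boolean decisions and assuming that the variable period depends on the length of the preceding active run; that condition, and the callback `hangover_for`, are hypothetical choices, since the abstract leaves the conditions open.

```python
def vad_with_hangover(raw_flags, hangover_for):
    """Extend the active indication after speech ends.

    raw_flags: per-frame raw decisions (True = active voice detected).
    hangover_for: function mapping the current active-run length (in frames)
                  to the number of frames the active mode is extended.
    """
    out = []
    active_run = 0  # consecutive raw-active frames seen so far
    remaining = 0   # hangover frames still to be spent
    for flag in raw_flags:
        if flag:
            active_run += 1
            remaining = hangover_for(active_run)
            out.append(True)
        else:
            active_run = 0
            if remaining > 0:
                remaining -= 1
                out.append(True)   # still indicating the active voice mode
            else:
                out.append(False)  # period expired: inactive voice mode
    return out
```

With `hangover_for = lambda run: 2 if run >= 3 else 1`, three active frames followed by silence stay marked active for two more frames, while a single isolated active frame is extended by only one.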


    ADAPTIVE NOISE STATE UPDATE FOR A VOICE ACTIVITY DETECTOR
    6.
    Invention application
    Status: under examination (published)

    Publication number: WO2006104555A3

    Publication date: 2007-06-28

    Application number: PCT/US2006003155

    Filing date: 2006-01-26

    CPC classification number: G10L25/78 G10L2025/786

    Abstract: There is provided a method of updating a noise state of a voice activity detection (VAD) for indicating an active voice mode and an inactive voice mode. The method comprises receiving an input signal having a plurality of frames, determining an elapsed time since the last update of the noise state, updating the noise state of the VAD if the elapsed time exceeds a predetermined time, determining an average minimum energy based on two or more of the plurality of frames, determining a current minimum energy based on a current frame of the plurality of frames, updating the noise state of the VAD if the average minimum energy is less than the current minimum energy, and updating the noise state of the VAD if the average minimum energy is greater than the current minimum energy plus a first predetermined value (Figure 7).
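
The three update conditions listed in this abstract translate directly into a predicate. The sketch below takes the two minimum-energy values as inputs, because the abstract does not say how they are computed; the function and parameter names are hypothetical.

```python
def should_update_noise_state(elapsed, max_elapsed,
                              avg_min_energy, cur_min_energy, margin):
    """Return True if any of the abstract's three update conditions holds."""
    if elapsed > max_elapsed:
        return True  # too long since the last update
    if avg_min_energy < cur_min_energy:
        return True  # average minimum energy is below the current minimum
    if avg_min_energy > cur_min_energy + margin:
        return True  # average exceeds the current minimum by more than the margin
    return False
```

Taken together, the energy conditions leave the noise state unchanged only while the average minimum energy lies within `[cur_min_energy, cur_min_energy + margin]` and the timer has not expired.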


    MUSIC DETECTION WITH LOW-COMPLEXITY PITCH CORRELATION ALGORITHM
    7.
    Invention application
    Status: under examination (published)

    Publication number: WO2006019555B1

    Publication date: 2006-09-21

    Application number: PCT/US2005023712

    Filing date: 2005-06-30

    Inventor: GAO YANG

    CPC classification number: G10L25/90 G10H2210/046 G10H2210/066 G10L25/78

    Abstract: A method for detecting music in a speech signal having a plurality of frames (120). The method comprises obtaining one or more first pitch correlation candidates from a first frame of the plurality of frames (771); obtaining one or more second pitch correlation candidates from a second frame of the plurality of frames (771); selecting a pitch correlation (Rp) from the one or more first pitch correlation candidates and one or more second pitch correlation candidates (773); and distinguishing music from background noise based on analyzing the pitch correlation (Rp) (775). The method may comprise filtering the speech signal using a one-order low-pass filter prior to the obtaining the one or more first pitch correlation candidates (920), and down sampling the speech signal by four prior to obtaining the one or more first pitch correlation candidates (940).


    LOW-COMPLEXITY MUSIC DETECTION ALGORITHM AND SYSTEM
    8.
    Invention application
    Status: under examination (published)

    Publication number: WO2006019556A2

    Publication date: 2006-02-23

    Application number: PCT/US2005023713

    Filing date: 2005-06-30

    Inventor: GAO YANG

    CPC classification number: G10L25/48 G10H2210/046 G10L25/78

    Abstract: A method for detecting music in a speech signal having a plurality of frames. The method comprises defining a music threshold value for a first parameter extracted from a frame of the speech signal, defining a background noise threshold value for the first parameter, and defining an unsure threshold value for the first parameter. The unsure threshold value falls between the music threshold value and the background noise threshold value. If the first parameter falls between the music threshold value and the background noise threshold value, the speech signal is classified as music or background noise based on analyzing a plurality of first parameters extracted from the plurality of frames.
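
The three-threshold decision can be sketched as below. The sketch assumes that larger parameter values indicate music and that the multi-frame analysis is a simple mean over recent values compared with the unsure threshold; both are illustrative assumptions, as are the names, because the abstract does not specify the parameter or the analysis.

```python
def classify_frame(param, history, music_thr, noise_thr, unsure_thr):
    """Classify as "music" or "noise" using three thresholds.

    param: the first parameter extracted from the current frame.
    history: the same parameter from earlier frames (may be empty).
    Requires noise_thr < unsure_thr < music_thr.
    """
    if param >= music_thr:
        return "music"  # clearly music from this frame alone
    if param <= noise_thr:
        return "noise"  # clearly background noise from this frame alone
    # Unsure region: fall back on several frames instead of one.
    window = list(history) + [param]
    mean = sum(window) / len(window)
    return "music" if mean >= unsure_thr else "noise"
```

The low complexity comes from the fact that clear-cut frames are decided with one or two comparisons, and the multi-frame analysis runs only for frames in the unsure region.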


    MUSIC DETECTION WITH LOW-COMPLEXITY PITCH CORRELATION ALGORITHM
    9.
    Invention application
    Status: under examination (published)

    Publication number: WO2006019555A2

    Publication date: 2006-02-23

    Application number: PCT/US2005023712

    Filing date: 2005-06-30

    Inventor: GAO YANG

    CPC classification number: G10L25/90 G10H2210/046 G10H2210/066 G10L25/78

    Abstract: A method is provided for detecting music in a speech signal having a plurality of frames. The method comprises obtaining one or more first pitch correlation candidates from a first frame of the plurality of frames; obtaining one or more second pitch correlation candidates from a second frame of the plurality of frames; selecting a pitch correlation (Rp) from the one or more first pitch correlation candidates and the one or more second pitch correlation candidates; and distinguishing music from background noise based on analyzing the pitch correlation (Rp). The method may further comprise filtering the speech signal using a one-order low-pass filter prior to the obtaining the one or more first pitch correlation candidates, and down sampling the speech signal by four prior to the obtaining the one or more first pitch correlation candidates.
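
The preprocessing and candidate selection described here can be sketched in plain Python. The filter coefficient `alpha`, the use of normalized autocorrelation as the pitch correlation, and the choice of the maximum candidate as Rp are assumptions for illustration; the abstract only states that Rp is selected from the candidates of two frames. Every lag is assumed to be smaller than the frame length.

```python
import math


def one_pole_lowpass(x, alpha=0.5):
    """First-order (one-pole) low-pass filter: y[n] = alpha*y[n-1] + (1-alpha)*x[n]."""
    y, prev = [], 0.0
    for s in x:
        prev = alpha * prev + (1.0 - alpha) * s
        y.append(prev)
    return y


def decimate_by_4(x):
    """Keep every fourth sample (down-sampling by four)."""
    return x[::4]


def normalized_correlation(x, lag):
    """Normalized correlation between x and itself delayed by lag samples."""
    a, b = x[lag:], x[:len(x) - lag]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0


def best_pitch_correlation(frame_a, frame_b, lags):
    """Select Rp as the largest candidate over both frames and all lags."""
    candidates = [normalized_correlation(f, lag)
                  for f in (frame_a, frame_b) for lag in lags]
    return max(candidates)
```

Down-sampling by four before the correlation search cuts both the number of samples and the number of lags to examine, which is where most of the complexity saving would come from.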


    SIMPLE NOISE SUPPRESSION MODEL
    10.
    Invention application
    Status: under examination (published)

    Publication number: WO2004084181B1

    Publication date: 2005-01-20

    Application number: PCT/US2004007583

    Filing date: 2004-03-11

    Inventor: GAO YANG

    Abstract: An approach for efficiently reducing background noise from a speech signal in real-time applications is presented. A noisy input speech signal is processed through an inverse filter (306) when the spectrum tilt (302) of the input signal is not that of a pure background noise model. The noisy input signal is also filtered in order to reduce the spectrum valley areas of the noisy input signal when the background noise is present.

