Unvoiced/Voiced Decision for Speech Processing

    Publication number: US20170110145A1

    Publication date: 2017-04-20

    Application number: US15391247

    Filing date: 2016-12-27

    Inventor: Yang Gao

    CPC classification number: G10L25/78 G10L19/22 G10L25/93

    Abstract: A method for speech processing includes determining an unvoicing parameter for a first frame of a speech signal and determining a smoothed unvoicing parameter for the first frame by weighting the unvoicing parameter of the first frame and a smoothed unvoicing parameter of a second frame. The unvoicing parameter reflects a speech characteristic of the first frame. The smoothed unvoicing parameter of the second frame is weighted less heavily when the smoothed unvoicing parameter of the second frame is greater than the unvoicing parameter of the first frame. The method further includes computing a difference, by a processor, between the unvoicing parameter of the first frame and the smoothed unvoicing parameter of the first frame, and determining a classification of the first frame according to the computed difference. The classification includes unvoiced speech or voiced speech. The first frame is processed in accordance with the classification of the first frame.
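
The smoothing and decision steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the unvoicing parameter is derived, so the sketch takes it as an input, and the weights, the threshold, the class name `UnvoicedDetector`, and the rule that a large positive difference means "unvoiced" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnvoicedDetector:
    # All three constants are hypothetical; the abstract gives no values.
    slow_weight: float = 0.9   # weight on the previous smoothed value (normal case)
    fast_weight: float = 0.5   # smaller weight, used when previous smoothed > current
    threshold: float = 0.1     # assumed decision threshold on the difference
    smoothed: Optional[float] = None

    def classify(self, unvoicing: float) -> str:
        """Update the smoothed parameter and classify the current frame."""
        if self.smoothed is None:
            # First frame: no history, so start from the raw parameter.
            self.smoothed = unvoicing
        else:
            # Weight the previous smoothed value less heavily when it
            # exceeds the current frame's unvoicing parameter.
            if self.smoothed > unvoicing:
                w = self.fast_weight
            else:
                w = self.slow_weight
            self.smoothed = w * self.smoothed + (1.0 - w) * unvoicing
        # The difference between raw and smoothed values is the decision parameter.
        diff = unvoicing - self.smoothed
        # Assumption: a raw value well above its smoothed track means unvoiced.
        return "unvoiced" if diff > self.threshold else "voiced"
```

Calling `classify` once per frame with values such as 0.2, 0.9, 0.1 shows the asymmetry: the smoothed value rises slowly toward 0.9 but drops quickly once the raw parameter falls below it.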

    73. Unvoiced/Voiced Decision for Speech Processing
    Invention application (status: in force)

    Publication number: US20150073783A1

    Publication date: 2015-03-12

    Application number: US14476547

    Filing date: 2014-09-03

    Inventor: Yang Gao

    CPC classification number: G10L25/78 G10L19/22 G10L25/93

    Abstract: In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.

    74. Audio Classification Based on Perceptual Quality for Low or Medium Bit Rates
    Invention application (status: in force)

    Publication number: US20140081629A1

    Publication date: 2014-03-20

    Application number: US14027052

    Filing date: 2013-09-13

    Inventor: Yang Gao

    Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICE signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for re-classification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameter may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals which are re-classified as VOICED signals may be encoded in the time-domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency-domain.
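
The reclassification rule in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the application: the function names `reclassify` and `coding_domain`, the bit-rate ceiling, the pitch-lag tolerance, and the correlation threshold are all hypothetical, and the abstract does not say how the criteria are combined.

```python
def reclassify(label, bit_rate, pitch_lags, pitch_corrs,
               max_bit_rate=24000, max_lag_diff=3, min_avg_corr=0.8):
    """Return "VOICED" for an AUDIO frame that looks strongly periodic.

    pitch_lags:  one pitch lag per subframe (in samples)
    pitch_corrs: one normalized pitch correlation per subframe (0..1)
    All three keyword thresholds are assumed values for illustration.
    """
    # Only AUDIO frames at low or medium bit rates are candidates.
    if label != "AUDIO" or bit_rate > max_bit_rate or not pitch_corrs:
        return label
    # Periodicity cue 1: pitch differences between adjacent subframes.
    lag_diffs = [abs(a - b) for a, b in zip(pitch_lags, pitch_lags[1:])]
    stable_pitch = all(d <= max_lag_diff for d in lag_diffs)
    # Periodicity cue 2: average normalized pitch correlation.
    avg_corr = sum(pitch_corrs) / len(pitch_corrs)
    if stable_pitch and avg_corr >= min_avg_corr:
        return "VOICED"
    return "AUDIO"


def coding_domain(label):
    """Reclassified frames go to time-domain coding, the rest to frequency-domain."""
    return "time" if label == "VOICED" else "frequency"
```

For example, a 12 kbit/s AUDIO frame with subframe lags of 50, 51, 50, 52 and correlations near 0.9 is reclassified and routed to the time-domain coder, while the same frame at 64 kbit/s keeps its AUDIO label.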

    75. Adding Second Enhancement Layer to CELP Based Core Layer
    Invention application (status: in force)

    Publication number: US20130110507A1

    Publication date: 2013-05-02

    Application number: US13725353

    Filing date: 2012-12-21

    Inventor: Yang Gao

    CPC classification number: G10L19/04 G10L19/24

    Abstract: In an embodiment, a method of transmitting an input audio signal is disclosed. A first coding error of the input audio signal with a scalable codec having a first enhancement layer is encoded, and a second coding error is encoded using a second enhancement layer after the first enhancement layer. Encoding the second coding error includes coding fine spectrum coefficients of the second coding error to produce coded fine spectrum coefficients, and coding a spectral envelope of the second coding error to produce a coded spectral envelope. The coded fine spectrum coefficients and the coded spectral envelope are transmitted.