System and Method for Audio Coding and Decoding
    11.
    Invention patent application; status: under examination (published)

    Publication No.: US20150025897A1

    Publication date: 2015-01-22

    Application No.: US14509737

    Filing date: 2014-10-08

    CPC classification number: G10L19/00 G10L19/26 G10L25/18

    Abstract: In accordance with an embodiment, a method of generating an encoded audio signal includes estimating a time-frequency energy of an input audio signal from a time-frequency filter bank, computing a global variance of the time-frequency energy, determining a post-processing method according to the global variance, and transmitting an encoded representation of the input audio signal along with an indication of the determined post-processing method.
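    The encoder-side decision described in this abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the function names, the two-way choice, the threshold value, and the mapping from variance to method index are hypothetical and are not taken from the patent.

```python
from statistics import pvariance


def time_frequency_energy(subband_samples):
    """Energy per time slot and band; subband_samples[t][k] is a filter-bank sample."""
    return [[abs(x) ** 2 for x in slot] for slot in subband_samples]


def global_variance(energy):
    """Population variance taken over every time-frequency energy value."""
    return pvariance([e for slot in energy for e in slot])


def select_post_processing(variance, threshold=1.0):
    """Hypothetical rule: method 0 below the threshold, method 1 otherwise."""
    return 0 if variance < threshold else 1


def encode_with_indication(subband_samples, threshold=1.0):
    """Return a placeholder payload together with the chosen method index."""
    energy = time_frequency_energy(subband_samples)
    method = select_post_processing(global_variance(energy), threshold)
    payload = b""  # placeholder: the actual audio coding is outside this sketch
    return {"payload": payload, "post_processing": method}
```

    A decoder receiving the `post_processing` field could then pick the matching post-processing routine without recomputing the variance itself.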

    Adding second enhancement layer to CELP based core layer
    12.
    Granted invention patent; status: in force

    Publication No.: US08775169B2

    Publication date: 2014-07-08

    Application No.: US13725353

    Filing date: 2012-12-21

    Inventor: Yang Gao

    CPC classification number: G10L19/04 G10L19/24

    Abstract: In an embodiment, a method of transmitting an input audio signal is disclosed. A first coding error of the input audio signal is encoded with a scalable codec having a first enhancement layer, and a second coding error is encoded using a second enhancement layer after the first enhancement layer. Encoding the second coding error includes coding fine spectrum coefficients of the second coding error to produce coded fine spectrum coefficients, and coding a spectral envelope of the second coding error to produce a coded spectral envelope. The coded fine spectrum coefficients and the coded spectral envelope are transmitted.
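    As a rough illustration of splitting a coding error into a spectral envelope and fine spectrum coefficients, the Python sketch below uses a per-band RMS value as the envelope and the envelope-normalized coefficients as the fine structure. The band size, the RMS definition, and all function names are assumptions for illustration; quantization and bitstream packing are omitted, and the patent's actual coding scheme may differ.

```python
import math


def split_envelope_and_fine(coeffs, band_size=4):
    """Return (envelope, fine): per-band RMS values and normalized coefficients."""
    envelope, fine = [], []
    for start in range(0, len(coeffs), band_size):
        band = coeffs[start:start + band_size]
        rms = math.sqrt(sum(c * c for c in band) / len(band))
        envelope.append(rms)
        # A silent band has no shape to preserve, so its fine part is all zeros.
        fine.extend(c / rms if rms > 0 else 0.0 for c in band)
    return envelope, fine


def reconstruct(envelope, fine, band_size=4):
    """Decoder side: scale each fine coefficient by the envelope of its band."""
    return [f * envelope[i // band_size] for i, f in enumerate(fine)]


def second_coding_error(original, decoded_after_first_layer):
    """Residual left after the core layer and first enhancement layer."""
    return [x - y for x, y in zip(original, decoded_after_first_layer)]
```

    Separating the two parts lets each be coded at its own resolution: the envelope needs only one value per band, while the fine coefficients carry the shape within each band.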

    Unvoiced/voiced decision for speech processing

    Publication No.: US11328739B2

    Publication date: 2022-05-10

    Application No.: US16506357

    Filing date: 2019-07-09

    Inventor: Yang Gao

    Abstract: A method and apparatus for speech processing are disclosed. A first unvoicing parameter for a first frame of a speech signal is determined and further smoothed based on a second unvoicing parameter for a second frame prior to the first frame. A difference between the first unvoicing parameter and the smoothed unvoicing parameter for the first frame is computed, and an unvoiced/voiced classification of the first frame is determined using the computed difference as a decision parameter. Further processing, such as bandwidth extension (BWE), is performed based on the classification of the first frame.

    Very short pitch detection and coding

    Publication No.: US11270716B2

    Publication date: 2022-03-08

    Application No.: US16668956

    Filing date: 2019-10-30

    Inventor: Yang Gao Fengyan Qi

    Abstract: A system and method are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether a speech or audio signal contains a very short pitch lag, that is, one shorter than a conventional minimum pitch limitation, using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in the time domain and detecting a lack of low frequency energy in the speech or audio signal in the frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation.

    Very short pitch detection and coding

    Publication No.: US10482892B2

    Publication date: 2019-11-19

    Application No.: US15662302

    Filing date: 2017-07-28

    Inventor: Yang Gao Fengyan Qi

    Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether a speech or audio signal contains a very short pitch lag, that is, one shorter than a conventional minimum pitch limitation, using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in the time domain and detecting a lack of low frequency energy in the speech or audio signal in the frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
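    The combination of a time-domain correlation check and a low-frequency energy check can be sketched in Python as follows. This is a simplified illustration, not the patented procedure: the thresholds, the cutoff choice (the sampling rate divided by the conventional minimum lag), the plain DFT, and every function name are assumptions.

```python
import cmath
import math


def normalized_correlation(signal, lag):
    """Normalized correlation between the signal and itself delayed by `lag`."""
    idx = range(lag, len(signal))
    num = sum(signal[n] * signal[n - lag] for n in idx)
    e_now = sum(signal[n] ** 2 for n in idx)
    e_past = sum(signal[n - lag] ** 2 for n in idx)
    denom = math.sqrt(e_now * e_past)
    return num / denom if denom > 0 else 0.0


def low_band_energy_ratio(signal, sample_rate, cutoff_hz):
    """Share of spectral energy below cutoff_hz, using a plain O(N^2) DFT."""
    n = len(signal)
    total = low = 0.0
    for k in range(1, n // 2):
        x = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        energy = abs(x) ** 2
        total += energy
        if k * sample_rate / n < cutoff_hz:
            low += energy
    return low / total if total > 0 else 0.0


def detect_very_short_pitch(signal, sample_rate, pit_min, pit_min_short,
                            corr_threshold=0.8, low_ratio_threshold=0.05):
    """Return a lag in [pit_min_short, pit_min) or None if none is detected."""
    best_lag, best_corr = None, 0.0
    for lag in range(pit_min_short, pit_min):
        corr = normalized_correlation(signal, lag)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    if best_lag is None or best_corr < corr_threshold:
        return None  # time-domain evidence is too weak
    # A true very short pitch has its fundamental above sample_rate / pit_min,
    # so noticeable energy below that frequency argues against it.
    cutoff_hz = sample_rate / pit_min
    if low_band_energy_ratio(signal, sample_rate, cutoff_hz) > low_ratio_threshold:
        return None
    return best_lag
```

    The frequency-domain check guards against a common failure of correlation alone: a lag that is only a fraction of the real pitch period can still correlate, but the real, lower fundamental then shows up as low-frequency energy.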

    Unvoiced/voiced decision for speech processing

    Publication No.: US10347275B2

    Publication date: 2019-07-09

    Application No.: US16040225

    Filing date: 2018-07-19

    Inventor: Yang Gao

    Abstract: A method for speech processing includes determining a first unvoicing parameter for a first subframe of a speech signal, and determining a smoothed unvoicing parameter for the first subframe according to a second unvoicing parameter of a second subframe prior to the first subframe of the speech signal. The first unvoicing parameter is determined according to a periodicity parameter and a spectral tilt parameter. The method further includes computing a difference between the first unvoicing parameter for the first subframe and the smoothed unvoicing parameter for the first subframe and determining a classification of the first subframe using the computed difference as a decision parameter. The classification indicates whether the first subframe is an unvoiced speech signal or not an unvoiced speech signal. Bandwidth extension is performed on the speech signal for the first subframe according to the classification of the first subframe.
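    The classification flow in this abstract (unvoicing parameter, smoothing, difference, decision) might look like the Python sketch below. The product form of the unvoicing parameter, the one-pole smoothing factor, the symmetric hysteresis thresholds, and the class and method names are all assumptions chosen for illustration; the abstract does not specify them.

```python
def unvoicing_parameter(periodicity, tilt):
    """Assumed combination: large when both periodicity and spectral tilt are low.

    Both inputs are taken to be normalized to the range [0, 1].
    """
    return (1.0 - periodicity) * (1.0 - tilt)


class UnvoicedClassifier:
    """Tracks a smoothed unvoicing parameter across subframes."""

    def __init__(self, alpha=0.9, threshold=0.1):
        self.alpha = alpha          # hypothetical smoothing factor
        self.threshold = threshold  # hypothetical decision threshold
        self.smoothed = None
        self.unvoiced = False

    def classify(self, periodicity, tilt):
        """Return True if the current subframe is classified as unvoiced."""
        p = unvoicing_parameter(periodicity, tilt)
        if self.smoothed is None:
            self.smoothed = p
        else:
            # The smoothed value depends on parameters of earlier subframes.
            self.smoothed = self.alpha * self.smoothed + (1.0 - self.alpha) * p
        diff = p - self.smoothed  # the decision parameter
        if diff > self.threshold:
            self.unvoiced = True
        elif diff < -self.threshold:
            self.unvoiced = False
        # Otherwise keep the previous decision (hysteresis).
        return self.unvoiced
```

    Using the difference rather than the raw parameter makes the decision relative to the recent history of the signal, so a fixed threshold can still respond to changes in level or speaker. A bandwidth extension stage could then branch on the returned flag.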

    Spectrum flatness control for bandwidth extension

    Publication No.: US10339938B2

    Publication date: 2019-07-02

    Application No.: US14719693

    Filing date: 2015-05-22

    Inventor: Yang Gao

    Abstract: In accordance with an embodiment, a method of decoding an encoded audio bitstream at a decoder includes receiving the audio bitstream, decoding a low band bitstream of the audio bitstream to get low band coefficients in a frequency domain, and copying a plurality of the low band coefficients to a high frequency band location to generate high band coefficients. The method further includes processing the high band coefficients to form processed high band coefficients. Processing includes modifying an energy envelope of the high band coefficients by multiplying modification gains to flatten or smooth the high band coefficients, and applying a received spectral envelope decoded from the received audio bitstream to the high band coefficients. The low band coefficients and the processed high band coefficients are then inverse-transformed to the time domain to obtain a time domain output signal.
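    The decoder-side steps in this abstract (copy low band coefficients, flatten them with modification gains, apply the received envelope) can be sketched in Python. The band sizes, the choice of full flattening toward the mean band energy, the RMS-valued envelope, and the function names are assumptions for illustration; the inverse transform to the time domain is omitted.

```python
import math


def band_energies(coeffs, band_size):
    """Mean squared value of each band of `band_size` coefficients."""
    return [sum(c * c for c in coeffs[s:s + band_size]) / len(coeffs[s:s + band_size])
            for s in range(0, len(coeffs), band_size)]


def copy_to_high_band(low_band, start, count):
    """Generate high band coefficients by copying part of the low band."""
    return list(low_band[start:start + count])


def flatten(coeffs, band_size=2):
    """Multiply each band by a modification gain that equalizes band energies."""
    energies = band_energies(coeffs, band_size)
    mean_energy = sum(energies) / len(energies)
    gains = [math.sqrt(mean_energy / e) if e > 0 else 1.0 for e in energies]
    return [c * gains[i // band_size] for i, c in enumerate(coeffs)]


def apply_envelope(coeffs, envelope, band_size=4):
    """Scale each band so that its RMS matches the decoded envelope value."""
    energies = band_energies(coeffs, band_size)
    scales = [target / math.sqrt(e) if e > 0 else 0.0
              for target, e in zip(envelope, energies)]
    return [c * scales[i // band_size] for i, c in enumerate(coeffs)]
```

    In this sketch the flattening bands are narrower than the envelope bands, so flattening removes low band fine-scale energy variation that the coarser transmitted envelope would otherwise leave in place.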

    System and method for mixed codebook excitation for speech coding

    Publication No.: US09972325B2

    Publication date: 2018-05-15

    Application No.: US13768814

    Filing date: 2013-02-15

    Inventor: Yang Gao

    CPC classification number: G10L19/00 G10L19/12

    Abstract: In accordance with an embodiment, a method of encoding an audio/speech signal includes determining a mixed codebook vector based on an incoming audio/speech signal, where the mixed codebook vector includes a sum of a first codebook entry from a first codebook and a second codebook entry from a second codebook. The method further includes generating an encoded audio signal based on the determined mixed codebook vector, and transmitting a coded excitation index of the determined mixed codebook vector.
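    A mixed codebook vector formed as the sum of one entry from each of two codebooks can be searched as in the Python sketch below. The exhaustive search, the selection criterion (squared correlation with the target divided by the vector energy, a common choice in CELP-style coders), and the function name are assumptions for illustration; filtering through a synthesis filter and any fast search are omitted.

```python
def search_mixed_codebook(target, codebook1, codebook2):
    """Return (index1, index2, gain) of the best-matching mixed vector.

    The mixed vector is codebook1[index1] + codebook2[index2], element-wise.
    Returns (None, None, 0.0) if every candidate has zero energy.
    """
    best = (None, None, 0.0)
    best_score = -1.0
    for i, entry1 in enumerate(codebook1):
        for j, entry2 in enumerate(codebook2):
            mixed = [a + b for a, b in zip(entry1, entry2)]
            energy = sum(v * v for v in mixed)
            if energy == 0:
                continue
            corr = sum(t * v for t, v in zip(target, mixed))
            score = corr * corr / energy
            if score > best_score:
                best_score = score
                best = (i, j, corr / energy)  # gain that minimizes the error
    return best
```

    The returned index pair plays the role of the coded excitation index; a decoder holding the same two codebooks can rebuild the mixed vector by summing the two entries.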
