High resolution audio coding for improving package loss concealment

    Publication No.: US11749290B2

    Publication date: 2023-09-05

    Application No.: US17373148

    Filing date: 2021-07-12

    Inventor: Yang Gao

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing long-term prediction (LTP) are described. One example of the methods includes determining a pitch gain and a pitch lag of an input audio signal for at least a predetermined number of frames. It is determined that the pitch gain of the input audio signal has exceeded a predetermined threshold and that a change of the pitch lag of the input audio signal has been within a predetermined range for at least the predetermined number of frames. In response to determining that the pitch gain of the input audio signal has exceeded the predetermined threshold and that the change of the third pitch lag has been within the predetermined range for at least the predetermined number of frames, a pitch gain is set for a current frame of the input audio signal.
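The stability test described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function name, the threshold values, and the gain values returned are all assumptions, since the abstract only says that "a pitch gain is set" once both conditions hold.

```python
from typing import Sequence


def ltp_gain_for_current_frame(
    pitch_gains: Sequence[float],
    pitch_lags: Sequence[int],
    gain_threshold: float = 0.8,   # assumed threshold, not from the abstract
    max_lag_change: int = 2,       # assumed allowed lag change, in samples
    min_frames: int = 3,           # assumed "predetermined number of frames"
    stable_gain: float = 1.0,      # assumed gain used when the pitch is stable
    default_gain: float = 0.0,     # assumed gain used otherwise
) -> float:
    """Return the pitch gain to use for the current frame."""
    if len(pitch_gains) != len(pitch_lags):
        raise ValueError("pitch_gains and pitch_lags must have the same length")
    if len(pitch_gains) < min_frames:
        return default_gain

    # Only the most recent min_frames frames take part in the decision.
    recent_gains = pitch_gains[-min_frames:]
    recent_lags = pitch_lags[-min_frames:]

    # Condition 1: the pitch gain exceeded the threshold in every recent frame.
    gain_ok = all(g > gain_threshold for g in recent_gains)
    # Condition 2: the frame-to-frame lag change stayed within the allowed range.
    lag_ok = all(
        abs(b - a) <= max_lag_change
        for a, b in zip(recent_lags, recent_lags[1:])
    )
    return stable_gain if gain_ok and lag_ok else default_gain
```

A caller would append the per-frame gain and lag estimates to the two histories and query the function once per frame.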

    Audio classification based on perceptual quality for low or medium bit rates

    Publication No.: US10283133B2

    Publication date: 2019-05-07

    Application No.: US15398321

    Filing date: 2017-01-04

    Inventor: Yang Gao

    Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICE signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for re-classification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameter may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals which are re-classified as VOICED signals may be encoded in the time-domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency-domain.
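As a rough sketch of the reclassification rule in this abstract, the check below combines two of the listed periodicity parameters: pitch differences between subframes and the average normalized pitch correlation. The function names, the bit-rate limit, and both thresholds are hypothetical values chosen for illustration only.

```python
from statistics import mean
from typing import Sequence


def reclassify(
    initial_class: str,
    subframe_pitch_lags: Sequence[int],
    subframe_pitch_corrs: Sequence[float],
    bit_rate_bps: int,
    max_bit_rate_bps: int = 24000,  # assumed "low or medium" bit-rate limit
    max_pitch_diff: int = 3,        # assumed limit on subframe pitch difference
    min_avg_corr: float = 0.7,      # assumed minimum average pitch correlation
) -> str:
    """Return "VOICED" if an AUDIO frame looks periodic, else the input class."""
    if initial_class != "AUDIO" or bit_rate_bps > max_bit_rate_bps:
        return initial_class
    if len(subframe_pitch_lags) < 2 or not subframe_pitch_corrs:
        return initial_class

    # Pitch differences between consecutive subframes.
    pitch_diffs = [
        abs(b - a)
        for a, b in zip(subframe_pitch_lags, subframe_pitch_lags[1:])
    ]
    periodic = (
        max(pitch_diffs) <= max_pitch_diff
        and mean(subframe_pitch_corrs) >= min_avg_corr
    )
    return "VOICED" if periodic else initial_class


def coding_domain(signal_class: str) -> str:
    """VOICED frames go to time-domain coding, the rest to frequency-domain."""
    return "time" if signal_class == "VOICED" else "frequency"
```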

    VERY SHORT PITCH DETECTION AND CODING
    Patent application

    Publication No.: US20170323652A1

    Publication date: 2017-11-09

    Application No.: US15662302

    Filing date: 2017-07-28

    Inventors: Yang Gao, Fengyan Qi

    Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in time domain and detecting a lack of low frequency energy in the speech or audio signal in frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
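The combined time-domain and frequency-domain test can be illustrated with the sketch below. It is an assumption-laden toy, not the claimed detector: the helper names, both thresholds, the naive DFT, and the choice of `sample_rate / conventional_min_lag` as the low-frequency cutoff are all introduced here for illustration.

```python
import cmath
import math
from typing import Optional, Sequence


def normalized_correlation(x: Sequence[float], lag: int) -> float:
    """Normalized correlation between x[n] and x[n - lag]."""
    idx = range(lag, len(x))
    num = sum(x[n] * x[n - lag] for n in idx)
    e1 = sum(x[n] ** 2 for n in idx)
    e2 = sum(x[n - lag] ** 2 for n in idx)
    return num / math.sqrt(e1 * e2) if e1 > 0 and e2 > 0 else 0.0


def low_freq_energy_ratio(x: Sequence[float], sample_rate: float,
                          cutoff_hz: float) -> float:
    """Share of spectral energy below cutoff_hz (naive DFT, DC excluded)."""
    n = len(x)
    total = low = 0.0
    for k in range(1, n // 2):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n < cutoff_hz:
            low += power
    return low / total if total > 0 else 0.0


def detect_very_short_pitch(
    x: Sequence[float],
    sample_rate: float,
    min_short_lag: int,
    conventional_min_lag: int,
    corr_threshold: float = 0.8,         # assumed threshold
    low_energy_threshold: float = 0.05,  # assumed threshold
) -> Optional[int]:
    """Return a lag below conventional_min_lag, or None if none is detected."""
    # Time domain: best pitch correlation among the very short candidate lags.
    best_lag, best_corr = max(
        ((lag, normalized_correlation(x, lag))
         for lag in range(min_short_lag, conventional_min_lag)),
        key=lambda pair: pair[1],
    )
    if best_corr < corr_threshold:
        return None
    # Frequency domain: a very short lag implies a high fundamental, so little
    # energy should lie below the frequency of the conventional minimum lag.
    cutoff_hz = sample_rate / conventional_min_lag
    if low_freq_energy_ratio(x, sample_rate, cutoff_hz) > low_energy_threshold:
        return None
    return best_lag
```

The returned lag would then be coded within the extended range starting at `min_short_lag` rather than at the conventional minimum.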

    Very short pitch detection and coding

    Publication No.: US09741357B2

    Publication date: 2017-08-22

    Application No.: US14744452

    Filing date: 2015-06-19

    Inventors: Yang Gao, Fengyan Qi

    Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in time domain and detecting a lack of low frequency energy in the speech or audio signal in frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.

    Bandwidth Extension System and Approach
    Patent application (status: under examination, published)

    Publication No.: US20160372124A1

    Publication date: 2016-12-22

    Application No.: US15256182

    Filing date: 2016-09-02

    Inventor: Yang Gao

    Abstract: A method of performing BandWidth Extension (BWE) includes a frequency band shifting approach to generate an extended high band signal in time domain and a gain determination approach of controlling the energy of the extended high band. The proposed approach allows shifting any size of low band to any size of high band. The BWE scaling gain is estimated by using available filter bank coefficients with extremely low bit rate or without costing any bit, combining three possible gain factors.

    Spectrum Flatness Control for Bandwidth Extension
    Patent application (status: under examination, published)

    Publication No.: US20150255073A1

    Publication date: 2015-09-10

    Application No.: US14719693

    Filing date: 2015-05-22

    Inventor: Yang Gao

    IPC classification: G10L19/002, G10L19/022

    Abstract: In accordance with an embodiment, a method of decoding an encoded audio bitstream at a decoder includes receiving the audio bitstream, decoding a low band bitstream of the audio bitstream to get low band coefficients in a frequency domain, and copying a plurality of the low band coefficients to a high frequency band location to generate high band coefficients. The method further includes processing the high band coefficients to form processed high band coefficients. Processing includes modifying an energy envelope of the high band coefficients by multiplying modification gains to flatten or smooth the high band coefficients, and applying a received spectral envelope decoded from the received audio bitstream to the high band coefficients. The low band coefficients and the processed high band coefficients are then inverse-transformed to the time domain to obtain a time domain output signal.
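The copy, flatten, and re-shape steps of this abstract can be sketched as below. This is an illustrative assumption rather than the patented procedure: the function name is hypothetical, flattening is done by normalizing each sub-band to unit RMS (the abstract only says modification gains "flatten or smooth" the envelope), and the final inverse transform is left out.

```python
import math
from typing import List, Sequence


def extend_high_band(
    low_band_coeffs: Sequence[float],
    src_start: int,
    n_high: int,
    received_envelope: Sequence[float],
    band_size: int = 4,   # assumed sub-band width, in coefficients
    eps: float = 1e-12,   # guards against division by zero
) -> List[float]:
    """Copy low band coefficients up, flatten them, apply the decoded envelope."""
    copied = list(low_band_coeffs[src_start:src_start + n_high])
    if len(copied) != n_high:
        raise ValueError("source region does not fit inside the low band")
    if n_high % band_size or len(received_envelope) != n_high // band_size:
        raise ValueError("need one envelope value per high band sub-band")

    processed: List[float] = []
    for b, target_rms in enumerate(received_envelope):
        band = copied[b * band_size:(b + 1) * band_size]
        rms = math.sqrt(sum(c * c for c in band) / band_size)
        # Modification gain: removes the copied band's own energy envelope.
        flatten_gain = 1.0 / (rms + eps)
        # The received spectral envelope then sets the final band energy.
        processed.extend(c * flatten_gain * target_rms for c in band)
    return processed
```

In a full decoder the low band coefficients and the returned high band coefficients would be concatenated and passed to the inverse transform.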

    Adaptive Bandwidth Extension and Apparatus for the Same
    Patent application (status: in force)

    Publication No.: US20150073784A1

    Publication date: 2015-03-12

    Application No.: US14478839

    Filing date: 2014-09-05

    Inventor: Yang Gao

    IPC classification: G10L19/12

    Abstract: In one embodiment of the present invention, a method of decoding an encoded audio bitstream and generating frequency bandwidth extension includes decoding the audio bitstream to produce a decoded low band audio signal and generate a low band excitation spectrum corresponding to a low frequency band. A sub-band area is selected from within the low frequency band using a parameter which indicates energy information of a spectral envelope of the decoded low band audio signal. A high band excitation spectrum is generated for a high frequency band by copying a sub-band excitation spectrum from the selected sub-band area to a high sub-band area corresponding to the high frequency band. Using the generated high band excitation spectrum, an extended high band audio signal is generated by applying a high band spectral envelope. The extended high band audio signal is added to the decoded low band audio signal to generate an audio output signal having an extended frequency bandwidth.
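A minimal sketch of the adaptive sub-band selection follows. The abstract does not state the selection rule, so picking the window with the highest envelope energy is an assumption made here, as are the function name and the per-band energy representation; envelope shaping and the final addition to the low band signal are omitted.

```python
from typing import List, Sequence, Tuple


def copy_adaptive_sub_band(
    low_excitation: Sequence[float],
    band_energies: Sequence[float],
    band_size: int,
    n_high_bands: int,
) -> Tuple[int, List[float]]:
    """Select a low band window by envelope energy and copy its excitation.

    band_energies holds one spectral-envelope energy per low band sub-band.
    Returns the index of the first selected sub-band and the copied spectrum.
    """
    if len(low_excitation) != len(band_energies) * band_size:
        raise ValueError("excitation length must equal bands * band_size")
    if not 0 < n_high_bands <= len(band_energies):
        raise ValueError("high band must fit inside the low band")

    # Assumed rule: choose the window whose envelope energy sum is largest.
    last_start = len(band_energies) - n_high_bands
    start_band = max(
        range(last_start + 1),
        key=lambda b: sum(band_energies[b:b + n_high_bands]),
    )
    start = start_band * band_size
    stop = start + n_high_bands * band_size
    return start_band, list(low_excitation[start:stop])
```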

    Efficient temporal envelope coding approach by prediction between low band signal and high band signal
    Granted patent (status: in force)

    Publication No.: US08942988B2

    Publication date: 2015-01-27

    Application No.: US13625874

    Filing date: 2012-09-25

    Inventor: Yang Gao

    Abstract: This invention provides a more efficient way to quantize temporal envelope shaping of high band signal by benefiting from energy relationship between low band signal and high band signal; if low band signal is well coded or it is coded with time domain codec such as CELP, temporal envelope shaping information of low band signal can be used to predict temporal envelope shaping of high band signal; the temporal envelope shaping prediction can bring significant saving of bits to precisely quantize temporal envelope shaping of high band signal. This prediction approach can be combined with other specific approach to further increase the efficiency and save more bits.

    System and Method for Audio Coding and Decoding
    Patent application (status: under examination, published)

    Publication No.: US20150025897A1

    Publication date: 2015-01-22

    Application No.: US14509737

    Filing date: 2014-10-08

    IPC classification: G10L19/00

    CPC classification: G10L19/00, G10L19/26, G10L25/18

    Abstract: In accordance with an embodiment, a method of generating an encoded audio signal, the method includes estimating a time-frequency energy of an input audio signal from a time-frequency filter bank, computing a global variance of the time-frequency energy, determining a post-processing method according to the global variance, and transmitting an encoded representation of the input audio signal along with an indication of the determined post-processing method.
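The encoder-side decision in this abstract can be sketched as follows. The abstract does not say how the global variance maps to a post-processing method, so the single threshold and the labels `method_a` and `method_b` are placeholders; the function name is also hypothetical.

```python
from statistics import pvariance
from typing import Sequence, Tuple


def choose_post_processing(
    tf_energy: Sequence[Sequence[float]],
    variance_threshold: float = 1.0,  # assumed decision threshold
) -> Tuple[str, float]:
    """Pick a post-processing label from the global variance of the energies.

    tf_energy[t][k] is the filter bank energy in time slot t and sub-band k.
    """
    flat = [e for slot in tf_energy for e in slot]
    if not flat:
        raise ValueError("tf_energy must contain at least one value")
    # Global variance: computed over all time slots and sub-bands together.
    global_var = pvariance(flat)
    method = "method_a" if global_var < variance_threshold else "method_b"
    return method, global_var
```

The returned label stands in for the indication that would be transmitted alongside the encoded signal.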
