Adaptively Encoding Pitch Lag For Voiced Speech
    61.
    Patent application
    Status: in force

    Publication number: US20130166287A1

    Publication date: 2013-06-27

    Application number: US13724700

    Filing date: 2012-12-21

    Inventor: Yang Gao

    CPC classification number: G10L25/90 G10L19/09 G10L19/18

    Abstract: System and method embodiments for dual-mode pitch coding are provided. The system and method embodiments are configured to adaptively code pitch lags of a voiced speech signal using one of two pitch coding modes according to a pitch length, stability, or both. The two pitch coding modes include a first pitch coding mode with relatively high precision and reduced dynamic range, and a second pitch coding mode with relatively large dynamic range and reduced precision. The first pitch coding mode is used upon determining that the voiced speech signal has a relatively short or substantially stable pitch. The second pitch coding mode is used upon determining that the voiced speech signal has a relatively long or less stable pitch or is a substantially noisy signal.
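
    Illustrative sketch (not from the patent): the mode selection described in the abstract could look roughly like the Python below. Every constant, the `CodedLag` type, and the function names are hypothetical. The rule used here requires the pitch to be both short and stable before choosing the high-precision mode, which is only one of the variants the abstract allows ("length, stability, or both").

```python
from dataclasses import dataclass

# Every constant here is an illustrative assumption, not a value from the patent.
MIN_LAG = 20.0            # shortest representable lag, in samples
INDEX_BITS = 8            # bits spent on one lag index in either mode
PRECISE_STEP = 0.25       # mode 1: quarter-sample precision, narrow range
WIDE_STEP = 1.0           # mode 2: integer precision, wide range
MAX_INDEX = (1 << INDEX_BITS) - 1
SHORT_LAG_MAX = MIN_LAG + MAX_INDEX * PRECISE_STEP  # longest lag mode 1 can hold
STABLE_MAX_DELTA = 2.0    # largest subframe-to-subframe change treated as stable


@dataclass
class CodedLag:
    mode: int    # 1 = high precision, 2 = large dynamic range
    index: int   # quantizer index in 0..MAX_INDEX


def select_mode(subframe_lags, is_noisy=False):
    """Pick mode 1 for a short, stable pitch track; otherwise mode 2."""
    if is_noisy:
        return 2
    short = max(subframe_lags) <= SHORT_LAG_MAX
    deltas = [abs(b - a) for a, b in zip(subframe_lags, subframe_lags[1:])]
    stable = all(d <= STABLE_MAX_DELTA for d in deltas)
    return 1 if (short and stable) else 2


def encode_lag(lag, mode):
    """Uniformly quantize one lag with the step size of the chosen mode."""
    step = PRECISE_STEP if mode == 1 else WIDE_STEP
    index = round((lag - MIN_LAG) / step)
    return CodedLag(mode, min(max(index, 0), MAX_INDEX))


def decode_lag(coded):
    """Map a coded lag back to a lag value in samples."""
    step = PRECISE_STEP if coded.mode == 1 else WIDE_STEP
    return MIN_LAG + coded.index * step
```

    With the same 8-bit index, mode 1 in this sketch covers lags from 20 to 83.75 samples in quarter-sample steps, while mode 2 covers 20 to 275 samples in whole-sample steps. That is the precision versus dynamic range trade-off the abstract describes.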

    High resolution audio coding
    62.
    Granted patent

    Publication number: US11735193B2

    Publication date: 2023-08-22

    Application number: US17372849

    Filing date: 2021-07-12

    Inventor: Yang Gao

    CPC classification number: G10L19/032

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing residual quantization are described. One example of the methods includes performing a first residual quantization on a first target residual signal at a first bit rate to generate a first quantized residual signal. A second target residual signal is generated based at least on the first quantized residual signal and the first target residual signal. A second residual quantization is performed on the second target residual signal at a second bit rate to generate a second quantized residual signal, where the first bit rate is different from the second bit rate.
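
    Illustrative sketch (not from the patent): a two-stage residual quantization of the kind the abstract outlines, written in Python. The uniform scalar quantizer, the use of bits per sample as a stand-in for bit rate, the choice of the second target as the plain difference between the first target and its quantized version, and all names are assumptions made for this example.

```python
import math


def uniform_quantize(samples, bits, full_scale):
    """Midrise uniform quantizer over [-full_scale, full_scale).

    Returns the reconstructed samples and the step size used.
    """
    levels = 1 << bits
    step = 2.0 * full_scale / levels
    out = []
    for x in samples:
        idx = math.floor(x / step)
        idx = min(max(idx, -levels // 2), levels // 2 - 1)  # clip to valid range
        out.append((idx + 0.5) * step)
    return out, step


def two_stage_residual_quantize(target, bits1, bits2, full_scale=1.0):
    """Quantize `target` in two stages that use different bit budgets."""
    # Stage 1: quantize the first target residual at the first rate.
    q1, step1 = uniform_quantize(target, bits1, full_scale)
    # Second target: what stage 1 failed to capture (assumed to be a difference).
    target2 = [t - q for t, q in zip(target, q1)]
    # Stage 2: quantize the leftover error at the second rate.
    # Its magnitude is at most step1 / 2 for in-range input.
    q2, _ = uniform_quantize(target2, bits2, step1 / 2.0)
    return q1, q2
```

    A decoder in this sketch would reconstruct each sample as the sum of the two quantized values, so the second stage refines the first rather than replacing it.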

    Unvoiced/Voiced Decision for Speech Processing

    Publication number: US20200005812A1

    Publication date: 2020-01-02

    Application number: US16506357

    Filing date: 2019-07-09

    Inventor: Yang Gao

    Abstract: A method and apparatus for speech processing are disclosed. A first unvoicing parameter for a first frame of a speech signal is determined and further smoothed based on a second unvoicing parameter for a second frame prior to the first frame. A difference between the first unvoicing parameter and the smoothed unvoicing parameter for the first subframe is computed, and an unvoiced/voiced classification of the first frame is determined using the computed difference as a decision parameter. Further processing, such as bandwidth extension (BWE), is performed based on the classification of the first frame.
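
    Illustrative sketch (not from the patent): the decision logic in the abstract, expressed in Python. The abstract does not say how the unvoicing parameter is computed, so it is taken as an input here. The first-order smoothing, the hysteresis thresholds `up` and `down`, the initial "voiced" state, and the function name are all assumptions.

```python
def classify_frames(unvoicing, alpha=0.8, up=0.2, down=0.2):
    """Label each frame 'unvoiced' or 'voiced' from per-frame unvoicing values.

    The decision parameter is the gap between the current value and its
    smoothed history; small gaps keep the previous decision (hysteresis).
    """
    labels = []
    smoothed = None
    state = "voiced"  # assumed starting state
    for p in unvoicing:
        # Smooth the current parameter using the history of prior frames.
        smoothed = p if smoothed is None else alpha * smoothed + (1.0 - alpha) * p
        diff = p - smoothed  # decision parameter
        if diff > up:
            state = "unvoiced"
        elif diff < -down:
            state = "voiced"
        labels.append(state)
    return labels
```

    Because the decision depends on the difference rather than on the raw parameter, a slow drift in the parameter's overall level does not flip the classification in this sketch; only a jump relative to recent history does.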

    Audio Classification Based on Perceptual Quality for Low or Medium Bit Rates

    Publication number: US20190237088A1

    Publication date: 2019-08-01

    Application number: US16375583

    Filing date: 2019-04-04

    Inventor: Yang Gao

    Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICED signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for reclassification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameters may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals that are reclassified as VOICED signals may be encoded in the time domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency domain.
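
    Illustrative sketch (not from the patent): a Python version of the reclassification rule the abstract describes. The specific thresholds, the combination of pitch stability with average normalized pitch correlation, and the function names are hypothetical; the abstract only says that one or more periodicity criteria must be met.

```python
def reclassify(initial_class, bit_rate_bps, subframe_pitches, subframe_correlations,
               max_bit_rate_bps=24000, max_pitch_diff=3.0, min_avg_corr=0.8):
    """Return 'VOICED' when an AUDIO frame looks periodic enough, else the input class.

    All threshold defaults are assumptions chosen for illustration.
    """
    # Only AUDIO frames at low or medium bit rates are candidates.
    if initial_class != "AUDIO" or bit_rate_bps > max_bit_rate_bps:
        return initial_class
    if not subframe_pitches or not subframe_correlations:
        return initial_class
    # Periodicity cue 1: pitch lag changes little from subframe to subframe.
    diffs = [abs(b - a) for a, b in zip(subframe_pitches, subframe_pitches[1:])]
    stable_pitch = all(d <= max_pitch_diff for d in diffs)
    # Periodicity cue 2: high average normalized pitch correlation.
    avg_corr = sum(subframe_correlations) / len(subframe_correlations)
    if stable_pitch and avg_corr >= min_avg_corr:
        return "VOICED"
    return initial_class


def coding_domain(signal_class):
    """VOICED frames go to time-domain coding, everything else to frequency-domain."""
    return "time" if signal_class == "VOICED" else "frequency"
```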

    Bandwidth extension system and approach

    Publication number: US10217470B2

    Publication date: 2019-02-26

    Application number: US15256182

    Filing date: 2016-09-02

    Inventor: Yang Gao

    Abstract: A method of performing bandwidth extension (BWE) includes a frequency band shifting approach that generates an extended high band signal in the time domain and a gain determination approach that controls the energy of the extended high band. The proposed approach allows a low band of any size to be shifted to a high band of any size. The BWE scaling gain is estimated from available filter bank coefficients, at an extremely low bit rate or without spending any bits, by combining three possible gain factors.

    Audio Classification Based on Perceptual Quality for Low or Medium Bit Rates

    Publication number: US20170116999A1

    Publication date: 2017-04-27

    Application number: US15398321

    Filing date: 2017-01-04

    Inventor: Yang Gao

    Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICED signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for reclassification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameters may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals that are reclassified as VOICED signals may be encoded in the time domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency domain.
