-
Publication No.: US20130166287A1
Publication date: 2013-06-27
Application No.: US13724700
Filing date: 2012-12-21
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
IPC: G10L11/04
Abstract: System and method embodiments for dual modes pitch coding are provided. The system and method embodiments are configured to adaptively code pitch lags of a voiced speech signal using one of two pitch coding modes according to a pitch length, stability, or both. The two pitch coding modes include a first pitch coding mode with relatively high precision and reduced dynamic range, and a second pitch coding mode with relatively large dynamic range and reduced precision. The first pitch coding mode is used upon determining that the voiced speech signal has a relatively short or substantially stable pitch. The second pitch coding mode is used upon determining that the voiced speech signal has a relatively long or less stable pitch or is a substantially noisy signal.
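The trade-off between the two modes in this abstract can be illustrated with a minimal sketch. It assumes two uniform quantizers that spend the same number of levels on different ranges; the lag ranges, step sizes, and the 80-sample limit are hypothetical values chosen for illustration, not taken from the patent. The decision here uses only pitch length and a noise flag, and stability is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitchMode:
    name: str
    min_lag: float  # smallest representable lag, in samples
    max_lag: float  # largest representable lag, in samples
    step: float     # quantization step, in samples

    def encode(self, lag):
        # Clamp into the mode's range, then map to the nearest step index.
        clamped = min(max(lag, self.min_lag), self.max_lag)
        return round((clamped - self.min_lag) / self.step)

    def decode(self, index):
        return self.min_lag + index * self.step


# Hypothetical modes: both use 256 indices, spent differently.
HIGH_PRECISION = PitchMode("high_precision", 16.0, 79.75, 0.25)
WIDE_RANGE = PitchMode("wide_range", 16.0, 271.0, 1.0)


def select_mode(lag, is_noisy=False, short_lag_limit=80.0):
    # Short, non-noisy pitch gets fine resolution; everything else gets range.
    if not is_noisy and lag < short_lag_limit:
        return HIGH_PRECISION
    return WIDE_RANGE


mode = select_mode(40.3)
index = mode.encode(40.3)
decoded = mode.decode(index)
```

With these assumed values, the high-precision mode reconstructs a lag to within 0.125 samples but cannot represent lags of 80 samples or more, while the wide-range mode covers lags up to 271 samples with an error of up to 0.5 samples.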
-
Publication No.: US11735193B2
Publication date: 2023-08-22
Application No.: US17372849
Filing date: 2021-07-12
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
IPC: G10L19/032
CPC classification number: G10L19/032
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing residual quantization are described. One example of the methods includes performing a first residual quantization on a first target residual signal at a first bit rate to generate a first quantized residual signal. A second target residual signal is generated based at least on the first quantized residual signal and the first target residual signal. A second residual quantization is performed on the second target residual signal at a second bit rate to generate a second quantized residual signal, where the first bit rate is different from the second bit rate.
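The two-stage structure described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: a plain uniform scalar quantizer stands in for the residual quantizer, bits per sample stand in for the bit rate, and the second target is taken to be the difference between the first target and its quantized version. The function names and default bit allocations are hypothetical.

```python
def uniform_quantize(signal, bits, peak):
    """Mid-rise uniform quantizer over [-peak, peak] with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 * peak / levels
    quantized = []
    for x in signal:
        index = int((x + peak) / step)
        index = min(max(index, 0), levels - 1)  # clamp out-of-range samples
        quantized.append(-peak + (index + 0.5) * step)
    return quantized


def two_stage_quantize(target, first_bits=4, second_bits=2, peak=1.0):
    # Stage 1: quantize the first target residual at the first rate.
    first_q = uniform_quantize(target, first_bits, peak)
    # Second target: what stage 1 failed to capture (assumed to be the error).
    second_target = [t - q for t, q in zip(target, first_q)]
    # Stage 1 error is at most half a step, i.e. peak / 2**first_bits.
    second_peak = peak / 2 ** first_bits
    # Stage 2: quantize the second target at a different rate.
    second_q = uniform_quantize(second_target, second_bits, second_peak)
    return first_q, second_q


target = [0.3, -0.72, 0.05, 0.91]
first_q, second_q = two_stage_quantize(target)
reconstruction = [a + b for a, b in zip(first_q, second_q)]
```

For in-range inputs, the reconstruction error after both stages is bounded by peak / 2**(first_bits + second_bits), so the two rates can be chosen independently to trade bits against accuracy.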
-
63.
Publication No.: US10885926B2
Publication date: 2021-01-05
Application No.: US16749755
Filing date: 2020-01-22
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
IPC: G10L19/125 , G10L19/22 , G10L19/002 , G10L19/00
Abstract: A method for processing speech signals prior to encoding a digital signal comprising audio data includes selecting frequency domain coding or time domain coding based on a coding bit rate to be used for coding the digital signal and a short pitch lag detection of the digital signal.
-
64.
Publication No.: US20200234724A1
Publication date: 2020-07-23
Application No.: US16749755
Filing date: 2020-01-22
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
IPC: G10L19/125 , G10L19/22 , G10L19/002
Abstract: A method for processing speech signals prior to encoding a digital signal comprising audio data includes selecting frequency domain coding or time domain coding based on a coding bit rate to be used for coding the digital signal and a short pitch lag detection of the digital signal.
-
65.
Publication No.: US20200005812A1
Publication date: 2020-01-02
Application No.: US16506357
Filing date: 2019-07-09
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
Abstract: Method and apparatus for speech processing are disclosed. A first unvoicing parameter for a first frame of a speech signal is determined, and further smoothed based on a second unvoicing parameter for a second frame prior to the first frame. A difference between the first unvoicing parameter and the smoothed unvoicing parameter for the first subframe is computed, and an unvoiced/voiced classification of the first frame is determined using the computed difference as a decision parameter. Further processing, such as bandwidth extension (BWE), is performed based on the classification of the first frame.
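The decision logic in this abstract can be sketched roughly as below. The sketch assumes a first-order recursive smoother, assumes that a larger unvoicing parameter means a more unvoiced frame, and uses hypothetical values for the smoothing factor and the decision threshold; none of these details come from the abstract.

```python
def classify_frames(unvoicing, alpha=0.9, threshold=0.1):
    """Label each frame from the gap between its unvoicing parameter
    and a running smoothed value carried over from earlier frames."""
    smoothed = unvoicing[0]  # assumed initial state of the smoother
    labels = []
    for current in unvoicing:
        # Smooth the current parameter using the state from prior frames.
        smoothed = alpha * smoothed + (1.0 - alpha) * current
        # The difference is the decision parameter.
        difference = current - smoothed
        labels.append("unvoiced" if difference > threshold else "voiced")
    return labels


labels = classify_frames([0.1, 0.1, 0.8, 0.8])
```

Because the smoothed value lags behind the raw parameter, the difference reacts quickly to a sudden rise in unvoicing, which is what makes it usable as a decision parameter.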
-
Publication No.: US20190237088A1
Publication date: 2019-08-01
Application No.: US16375583
Filing date: 2019-04-04
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
IPC: G10L19/24 , G10L25/93 , G10L19/20 , G10L19/002
CPC classification number: G10L19/24 , G10L19/002 , G10L19/20 , G10L25/06 , G10L25/90 , G10L25/93 , G10L2025/937
Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICED signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for reclassification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameter may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals which are reclassified as VOICED signals may be encoded in the time-domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency-domain.
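A minimal sketch of such a reclassification rule is shown below. It combines two of the periodicity parameters named in the abstract (pitch differences between subframes and the average normalized pitch correlation). The function names, the bit rate ceiling, and both thresholds are hypothetical and only illustrate the shape of the decision.

```python
def reclassify(label, bit_rate, pitch_lags, pitch_corrs,
               max_bit_rate=24400, max_pitch_diff=3, min_avg_corr=0.8):
    """Return "VOICED" for an AUDIO frame that looks periodic enough."""
    # Only AUDIO frames at low or medium bit rates are candidates.
    if label != "AUDIO" or bit_rate > max_bit_rate:
        return label
    # Pitch differences between consecutive subframes.
    diffs = [abs(b - a) for a, b in zip(pitch_lags, pitch_lags[1:])]
    stable_pitch = all(d <= max_pitch_diff for d in diffs)
    # Average normalized pitch correlation over the subframes.
    avg_corr = sum(pitch_corrs) / len(pitch_corrs)
    if stable_pitch and avg_corr >= min_avg_corr:
        return "VOICED"
    return label


def coding_domain(label):
    # VOICED frames go to time-domain coding, AUDIO frames to frequency-domain.
    return "time" if label == "VOICED" else "frequency"


label = reclassify("AUDIO", 12000, [50, 51, 50, 52], [0.9, 0.85, 0.95, 0.9])
domain = coding_domain(label)
```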
-
Publication No.: US10217470B2
Publication date: 2019-02-26
Application No.: US15256182
Filing date: 2016-09-02
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
IPC: G10L19/02 , G10L21/038 , G10L19/022 , G10L19/028 , G10L19/26
Abstract: A method of performing BandWidth Extension (BWE) includes a frequency band shifting approach to generate an extended high band signal in time domain and a gain determination approach of controlling the energy of the extended high band. The proposed approach allows shifting any size of low band to any size of high band. The BWE scaling gain is estimated by using available filter bank coefficients with extremely low bit rate or without costing any bit, combining three possible gain factors.
-
Publication No.: US09837092B2
Publication date: 2017-12-05
Application No.: US15592573
Filing date: 2017-05-11
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
IPC: G10L21/00 , G10L19/125 , G10L19/22 , G10L19/002 , G10L19/00
CPC classification number: G10L19/125 , G10L19/002 , G10L19/22 , G10L2019/0002 , G10L2019/0011 , G10L2019/0016
Abstract: A method for processing speech signals prior to encoding a digital signal comprising audio data includes selecting frequency domain coding or time domain coding based on a coding bit rate to be used for coding the digital signal and a short pitch lag detection of the digital signal.
-
Publication No.: US09646616B2
Publication date: 2017-05-09
Application No.: US14509737
Filing date: 2014-10-08
Applicant: Huawei Technologies Co., Ltd.
Inventor: David Virette , Yang Gao , Wei Xiao
Abstract: In accordance with an embodiment, a method of generating an encoded audio signal includes estimating a time-frequency energy of an input audio signal from a time-frequency filter bank, computing a global variance of the time-frequency energy, determining a post-processing method according to the global variance, and transmitting an encoded representation of the input audio signal along with an indication of the determined post-processing method.
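The variance-driven choice in this abstract might look like the sketch below. It assumes the time-frequency energy arrives as a list of time slots, each holding per-band energies, and that the global variance is the population variance over all of them. The threshold and the labels "method_a" and "method_b" are placeholders, since the abstract does not name the post-processing methods.

```python
from statistics import pvariance


def global_variance(tf_energy):
    """Population variance over every time slot and band."""
    flat = [energy for slot in tf_energy for energy in slot]
    return pvariance(flat)


def choose_post_processing(tf_energy, threshold=1.0):
    # Placeholder labels; the chosen label is what would be signalled
    # alongside the encoded audio.
    if global_variance(tf_energy) >= threshold:
        return "method_a"
    return "method_b"


flat_spectrum = [[1.0, 1.0], [1.0, 1.0]]
peaky_spectrum = [[0.0, 4.0], [0.0, 4.0]]
choice_flat = choose_post_processing(flat_spectrum)
choice_peaky = choose_post_processing(peaky_spectrum)
```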
-
Publication No.: US20170116999A1
Publication date: 2017-04-27
Application No.: US15398321
Filing date: 2017-01-04
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
CPC classification number: G10L19/24 , G10L19/002 , G10L19/20 , G10L25/06 , G10L25/90 , G10L25/93 , G10L2025/937
Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICED signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for reclassification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameter may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals which are reclassified as VOICED signals may be encoded in the time-domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency-domain.