-
Publication No.: US20150025897A1
Publication date: 2015-01-22
Application No.: US14509737
Application date: 2014-10-08
Applicant: Huawei Technologies Co., Ltd.
Inventor: David Virette, Yang Gao, Wei Xiao
IPC: G10L19/00
Abstract: In accordance with an embodiment, a method of generating an encoded audio signal includes estimating a time-frequency energy of an input audio signal from a time-frequency filter bank, computing a global variance of the time-frequency energy, determining a post-processing method according to the global variance, and transmitting an encoded representation of the input audio signal along with an indication of the determined post-processing method.
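The encoder-side decision described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function names, the two-way flag, the threshold value, and the direction of the comparison are all assumptions, and the filter-bank output is taken as a ready-made list of frames.

```python
from statistics import pvariance

def time_frequency_energy(subband_samples):
    """Energy per time-frequency tile; subband_samples[t][k] is a hypothetical filter-bank output."""
    return [[abs(x) ** 2 for x in frame] for frame in subband_samples]

def global_variance(energy):
    """Variance of the energy taken over all time-frequency tiles at once."""
    return pvariance([e for frame in energy for e in frame])

def choose_postprocessing(variance, threshold=1.0):
    """Return a flag (0 or 1) to send with the bitstream; threshold and direction are assumed."""
    return 1 if variance > threshold else 0

flat = time_frequency_energy([[1, 1], [1, 1]])
peaky = time_frequency_energy([[0, 2], [0, 2]])
print(choose_postprocessing(global_variance(flat)))
print(choose_postprocessing(global_variance(peaky)))
```

The flag stands in for the "indication of the determined post-processing method" that is sent alongside the encoded signal.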
-
Publication No.: US08775169B2
Publication date: 2014-07-08
Application No.: US13725353
Application date: 2012-12-21
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
IPC: G10L19/00
Abstract: In an embodiment, a method of transmitting an input audio signal is disclosed. A first coding error of the input audio signal is encoded with a scalable codec having a first enhancement layer, and a second coding error is encoded using a second enhancement layer after the first enhancement layer. Encoding the second coding error includes coding fine spectrum coefficients of the second coding error to produce coded fine spectrum coefficients, and coding a spectral envelope of the second coding error to produce a coded spectral envelope. The coded fine spectrum coefficients and the coded spectral envelope are transmitted.
-
Publication No.: US20240221766A1
Publication date: 2024-07-04
Application No.: US18400067
Application date: 2023-12-29
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao, Fengyan Qi
Abstract: A method includes detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in a time domain and detecting a lack of low frequency energy in the speech or audio signal in a frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
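A rough sketch of the combined time-domain and frequency-domain check described in this abstract is given below. Everything specific in it is an assumption made for illustration: the function names, the thresholds, the lag range, the naive DFT, and the rule that both conditions must hold.

```python
import math

def normalized_correlation(signal, lag):
    """Normalized correlation between the signal and a copy delayed by `lag` samples."""
    current, delayed = signal[lag:], signal[:-lag]
    num = sum(x * y for x, y in zip(current, delayed))
    den = math.sqrt(sum(x * x for x in current) * sum(y * y for y in delayed))
    return num / den if den > 0.0 else 0.0

def low_frequency_energy_ratio(signal, cutoff_bin):
    """Share of spectral energy in DFT bins 1..cutoff_bin, computed with a naive DFT."""
    n = len(signal)
    energies = []
    for k in range(1, n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(signal))
        energies.append(re * re + im * im)
    total = sum(energies)
    return sum(energies[:cutoff_bin]) / total if total > 0.0 else 0.0

def detect_very_short_pitch(signal, short_min, conventional_min, cutoff_bin,
                            corr_threshold=0.8, low_energy_threshold=0.1):
    """Return a lag below conventional_min if both checks pass, else None (thresholds assumed)."""
    candidates = range(short_min, conventional_min)
    best_lag = max(candidates, key=lambda lag: normalized_correlation(signal, lag))
    strong_correlation = normalized_correlation(signal, best_lag) >= corr_threshold
    lacks_low_energy = low_frequency_energy_ratio(signal, cutoff_bin) <= low_energy_threshold
    return best_lag if strong_correlation and lacks_low_energy else None

tone = [math.sin(2.0 * math.pi * i / 20) for i in range(200)]
print(detect_very_short_pitch(tone, short_min=16, conventional_min=34, cutoff_bin=5))
```

A detected lag would then be coded with a pitch range that starts at `short_min` rather than at the conventional minimum.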
-
Publication No.: US11328739B2
Publication date: 2022-05-10
Application No.: US16506357
Application date: 2019-07-09
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
Abstract: A method and an apparatus for speech processing are disclosed. A first unvoicing parameter for a first frame of a speech signal is determined, and further smoothed based on a second unvoicing parameter for a second frame prior to the first frame. A difference between the first unvoicing parameter and the smoothed unvoicing parameter for the first frame is computed, and an unvoiced/voiced classification of the first frame is determined using the computed difference as a decision parameter. Further processing, such as bandwidth extension (BWE), is performed based on the classification of the first frame.
-
Publication No.: US11270716B2
Publication date: 2022-03-08
Application No.: US16668956
Application date: 2019-10-30
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao, Fengyan Qi
Abstract: A system and method are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in the time domain and detecting a lack of low frequency energy in the speech or audio signal in the frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation.
-
Publication No.: US10482892B2
Publication date: 2019-11-19
Application No.: US15662302
Application date: 2017-07-28
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao, Fengyan Qi
Abstract: System and method embodiments are provided for very short pitch detection and coding for speech or audio signals. The system and method include detecting whether there is a very short pitch lag in a speech or audio signal that is shorter than a conventional minimum pitch limitation using a combination of time domain and frequency domain pitch detection techniques. The pitch detection techniques include using pitch correlations in the time domain and detecting a lack of low frequency energy in the speech or audio signal in the frequency domain. The detected very short pitch lag is coded using a pitch range from a predetermined minimum very short pitch limitation that is smaller than the conventional minimum pitch limitation.
-
Publication No.: US10347275B2
Publication date: 2019-07-09
Application No.: US16040225
Application date: 2018-07-19
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
Abstract: A method for speech processing includes determining a first unvoicing parameter for a first subframe of a speech signal, and determining a smoothed unvoicing parameter for the first subframe according to a second unvoicing parameter of a second subframe prior to the first subframe of the speech signal. The first unvoicing parameter is determined according to a periodicity parameter and a spectral tilt parameter. The method further includes computing a difference between the first unvoicing parameter for the first subframe and the smoothed unvoicing parameter for the first subframe and determining a classification of the first subframe using the computed difference as a decision parameter. The classification indicates whether the first subframe is an unvoiced speech signal or not an unvoiced speech signal. Bandwidth extension is performed on the speech signal for the first subframe according to the classification of the first subframe.
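The classification flow in this abstract (unvoicing parameter, smoothed version, difference as the decision parameter) might look like the sketch below. The formula combining periodicity and spectral tilt, the recursive smoothing, the smoothing factor, and the threshold are illustrative assumptions; the abstract does not specify them.

```python
def unvoicing_parameter(periodicity, spectral_tilt):
    """Assumed combination: large when the subframe is aperiodic and tilted toward high frequencies."""
    return (1.0 - periodicity) * (1.0 - spectral_tilt)

def classify_subframes(features, alpha=0.9, threshold=0.1):
    """Return one boolean per subframe: True means classified as unvoiced.

    features is a list of (periodicity, spectral_tilt) pairs, one per subframe.
    alpha and threshold are hypothetical tuning values.
    """
    smoothed = None
    labels = []
    for periodicity, tilt in features:
        current = unvoicing_parameter(periodicity, tilt)
        # Smooth using the history of earlier subframes (first subframe has no history).
        smoothed = current if smoothed is None else alpha * smoothed + (1.0 - alpha) * current
        # The difference between the raw and smoothed values is the decision parameter.
        labels.append(current - smoothed > threshold)
    return labels

voiced = [(0.9, 0.8)] * 5
onset_of_unvoiced = [(0.1, -0.5)]
print(classify_subframes(voiced + onset_of_unvoiced))
```

In a decoder, the returned label would then select how bandwidth extension is applied to that subframe.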
-
Publication No.: US10339938B2
Publication date: 2019-07-02
Application No.: US14719693
Application date: 2015-05-22
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
IPC: G10L19/24, G10L19/26, G10L25/18, G10L19/002, G10L19/022, G10L21/038, G10L21/0388
Abstract: In accordance with an embodiment, a method of decoding an encoded audio bitstream at a decoder includes receiving the audio bitstream, decoding a low band bitstream of the audio bitstream to get low band coefficients in a frequency domain, and copying a plurality of the low band coefficients to a high frequency band location to generate high band coefficients. The method further includes processing the high band coefficients to form processed high band coefficients. Processing includes modifying an energy envelope of the high band coefficients by multiplying modification gains to flatten or smooth the high band coefficients, and applying a received spectral envelope decoded from the received audio bitstream to the high band coefficients. The low band coefficients and the processed high band coefficients are then inverse-transformed to the time domain to obtain a time domain output signal.
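The high-band generation steps in this abstract (copy, flatten, apply the received envelope) can be sketched as below. The choice of which low-band coefficients to copy, the per-band RMS flattening, the band size, and all names are assumptions made for illustration; the inverse transform to the time domain is left out.

```python
import math

def band_rms(coeffs):
    """Root-mean-square level of one sub-band of frequency-domain coefficients."""
    return math.sqrt(sum(c * c for c in coeffs) / len(coeffs))

def generate_high_band(low_band, decoded_envelope, band_size):
    """Build high-band coefficients from decoded low-band coefficients.

    decoded_envelope holds one target RMS level per high-band sub-band
    (a stand-in for the spectral envelope decoded from the bitstream).
    """
    n_high = len(decoded_envelope) * band_size
    if n_high > len(low_band):
        raise ValueError("not enough low-band coefficients to copy")
    copied = low_band[-n_high:]  # assumed source: the top of the low band
    high_band = []
    for b, target in enumerate(decoded_envelope):
        band = copied[b * band_size:(b + 1) * band_size]
        rms = band_rms(band)
        flatten_gain = 1.0 / rms if rms > 0.0 else 0.0  # modification gain that flattens the energy
        high_band.extend(c * flatten_gain * target for c in band)
    return high_band

low = [1.0, -1.0, 2.0, -2.0, 4.0, -4.0, 8.0, -8.0]
print(generate_high_band(low, decoded_envelope=[0.5, 2.0], band_size=2))
```

Flattening first means the copied coefficients contribute only fine structure, while the transmitted envelope alone sets the energy of each high-band sub-band.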
-
Publication No.: US10083698B2
Publication date: 2018-09-25
Application No.: US15677027
Application date: 2017-08-15
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yang Gao
IPC: G10L19/005, G10L19/09, G10L19/083, G10L19/22
CPC classification number: G10L19/005, G10L19/083, G10L19/09, G10L19/22
Abstract: A speech coding method reduces error propagation due to voice packet loss by limiting or reducing a pitch gain only for the first subframe or the first two subframes within a speech frame. The excitation of a next frame is obtained according to the reduced or limited pitch gain value of the first subframe, and the next frame is encoded according to the obtained excitation. The method is used for a voiced speech class.
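The gain-limiting step in this abstract can be illustrated with a small sketch. The cap value, the function name, and the representation of a frame as a list of per-subframe pitch gains are assumptions; only the idea of limiting the first one or two subframes of a voiced frame comes from the abstract.

```python
def limit_pitch_gains(pitch_gains, is_voiced, max_gain=0.8, limited_subframes=1):
    """Cap the pitch gain of the first `limited_subframes` subframes of a voiced frame.

    pitch_gains has one adaptive-codebook gain per subframe; max_gain is a
    hypothetical cap. Unvoiced frames are returned unchanged.
    """
    if not is_voiced:
        return list(pitch_gains)
    return [min(g, max_gain) if i < limited_subframes else g
            for i, g in enumerate(pitch_gains)]

print(limit_pitch_gains([1.2, 1.1, 0.9, 1.0], is_voiced=True))
print(limit_pitch_gains([1.2, 1.1, 0.9, 1.0], is_voiced=True, limited_subframes=2))
```

Capping only the start of the frame limits how strongly a new frame depends on past excitation that may have been lost, while later subframes keep their full gain.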
-
Publication No.: US09972325B2
Publication date: 2018-05-15
Application No.: US13768814
Application date: 2013-02-15
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
Abstract: In accordance with an embodiment, a method of encoding an audio/speech signal includes determining a mixed codebook vector based on an incoming audio/speech signal, where the mixed codebook vector includes a sum of a first codebook entry from a first codebook and a second codebook entry from a second codebook. The method further includes generating an encoded audio signal based on the determined mixed codebook vector, and transmitting a coded excitation index of the determined mixed codebook vector.
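A mixed codebook vector, as described in this abstract, is the sum of one entry from each of two codebooks. The sketch below runs an exhaustive search for the pair whose sum is closest to a target vector. The plain squared-error criterion, the toy codebooks, and the function name are assumptions; a practical CELP-style search would typically work in a perceptually weighted domain with gains.

```python
def search_mixed_codebook(target, first_codebook, second_codebook):
    """Return (first_index, second_index, mixed_vector) minimizing squared error to the target."""
    best = None
    for i, first in enumerate(first_codebook):
        for j, second in enumerate(second_codebook):
            mixed = [a + b for a, b in zip(first, second)]
            error = sum((t - m) ** 2 for t, m in zip(target, mixed))
            if best is None or error < best[0]:
                best = (error, i, j, mixed)
    return best[1], best[2], best[3]

# Toy codebooks (hypothetical): pulse-like entries and noise-like entries.
pulse_like = [[1, 0, 0, 0], [0, 0, 1, 0]]
noise_like = [[0, 0.5, 0, -0.5], [0, -0.5, 0, 0.5]]
i, j, mixed = search_mixed_codebook([0, -0.4, 1.1, 0.6], pulse_like, noise_like)
print(i, j, mixed)
```

The pair of indices plays the role of the coded excitation index that the encoder transmits.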