Patent search ap:("Huawei Technologies Co., Ltd.") AND inv:"Yang Gao"

61.

Invention application
Adaptively Encoding Pitch Lag For Voiced Speech
Status: In force

Publication (announcement) number: US20130166287A1

Publication (announcement) date: 2013-06-27

Application number: US13724700

Application date: 2012-12-21

Applicant: Huawei Technologies Co., Ltd.

Inventor: Yang Gao

IPC: G10L11/04

CPC classification number: G10L25/90 , G10L19/09 , G10L19/18

Abstract: System and method embodiments for dual modes pitch coding are provided. The system and method embodiments are configured to adaptively code pitch lags of a voiced speech signal using one of two pitch coding modes according to a pitch length, stability, or both. The two pitch coding modes include a first pitch coding mode with relatively high precision and reduced dynamic range, and a second pitch coding mode with relatively large dynamic range and reduced precision. The first pitch coding mode is used upon determining that the voiced speech signal has a relatively short or substantially stable pitch. The second pitch coding mode is used upon determining that the voiced speech signal has a relatively long or less stable pitch or is a substantially noisy signal.
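The trade-off between the two modes in the abstract can be illustrated with a minimal Python sketch. It assumes a fixed index width, so a finer step necessarily covers a smaller lag range. The bit width, step sizes, minimum lag, and function names are hypothetical and not taken from the patent, and the sketch selects the mode from pitch length and a noise flag only (stability-based selection is omitted).

```python
# Illustrative assumptions only; none of these values come from the patent.
BITS = 8                      # index width shared by both modes
N_LEVELS = 1 << BITS          # 256 quantizer levels
MIN_LAG = 16.0                # smallest representable pitch lag (samples)
FINE_STEP = 0.25              # mode 1: high precision, reduced dynamic range
COARSE_STEP = 1.0             # mode 2: large dynamic range, reduced precision
FINE_MAX_LAG = MIN_LAG + (N_LEVELS - 1) * FINE_STEP  # longest lag mode 1 can hold


def choose_mode(lag, is_noisy=False):
    """Return 1 for a short pitch lag, 2 for a long lag or a noisy signal."""
    if is_noisy or lag > FINE_MAX_LAG:
        return 2
    return 1


def encode_lag(lag, is_noisy=False):
    """Quantize a pitch lag; returns (mode, index)."""
    mode = choose_mode(lag, is_noisy)
    step = FINE_STEP if mode == 1 else COARSE_STEP
    index = round((lag - MIN_LAG) / step)
    index = max(0, min(N_LEVELS - 1, index))  # clamp to the index range
    return mode, index


def decode_lag(mode, index):
    """Reconstruct the pitch lag from (mode, index)."""
    step = FINE_STEP if mode == 1 else COARSE_STEP
    return MIN_LAG + index * step
```

With these assumed values, a lag of 40.25 samples is coded in mode 1 and reconstructed exactly, while a lag of 150.3 samples falls outside the mode 1 range and is coded in mode 2 at integer precision.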


62.

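Invention grant
High resolution audio coding
Status: In force

Publication (announcement) number: US11735193B2

Publication (announcement) date: 2023-08-22

Application number: US17372849

Application date: 2021-07-12

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yang Gao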

IPC: G10L19/032

CPC classification number: G10L19/032

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing residual quantization are described. One example of the methods includes performing a first residual quantization on a first target residual signal at a first bit rate to generate a first quantized residual signal. A second target residual signal is generated based at least on the first quantized residual signal and the first target residual signal. A second residual quantization is performed on the second target residual signal at a second bit rate to generate a second quantized residual signal, where the first bit rate is different from the second bit rate.
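The two-stage scheme in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions: it uses a plain uniform scalar quantizer, treats "bit rate" as bits per sample, and forms the second target as the difference between the first target and its quantized version. The quantizer, the bit allocations, and all function names are hypothetical and not taken from the patent.

```python
def uniform_quantize(values, bits, full_scale=1.0):
    """Uniform scalar quantizer over [-full_scale, full_scale] (assumed quantizer type)."""
    levels = (1 << bits) - 1
    step = 2.0 * full_scale / levels
    out = []
    for v in values:
        clipped = max(-full_scale, min(full_scale, v))
        index = round((clipped + full_scale) / step)
        out.append(index * step - full_scale)
    return out


def two_stage_quantize(target, bits_stage1=3, bits_stage2=5):
    """Quantize `target` in two stages that use different (assumed) bit allocations."""
    # Stage 1: coarse quantization of the first target residual.
    q1 = uniform_quantize(target, bits_stage1)
    # Second target: what stage 1 failed to represent.
    second_target = [t - q for t, q in zip(target, q1)]
    # Stage 2: quantize the remainder, scaled to the stage 1 half-step.
    step1 = 2.0 / ((1 << bits_stage1) - 1)
    q2 = uniform_quantize(second_target, bits_stage2, full_scale=step1 / 2)
    reconstruction = [a + b for a, b in zip(q1, q2)]
    return q1, q2, reconstruction
```

For inputs within [-1, 1], the reconstruction error after the second stage is bounded by half of the second-stage step, which is much smaller than the first-stage error.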

63.

Invention grant
Classification between time-domain coding and frequency domain coding for high bit rates
Status: In force

Publication (announcement) number: US10885926B2

Publication (announcement) date: 2021-01-05

Application number: US16749755

Application date: 2020-01-22

Applicant: Huawei Technologies Co., Ltd.

Inventor: Yang Gao

IPC: G10L19/125 , G10L19/22 , G10L19/002 , G10L19/00

Abstract: A method for processing speech signals prior to encoding a digital signal comprising audio data includes selecting frequency domain coding or time domain coding based on a coding bit rate to be used for coding the digital signal and a short pitch lag detection of the digital signal.

64.

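Invention application
Classification Between Time-Domain Coding and Frequency Domain Coding for High Bit Rates
Status: Under examination (published)

Publication (announcement) number: US20200234724A1

Publication (announcement) date: 2020-07-23

Application number: US16749755

Application date: 2020-01-22

Applicant: Huawei Technologies Co., Ltd.

Inventor: Yang Gao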

IPC: G10L19/125 , G10L19/22 , G10L19/002

Abstract: A method for processing speech signals prior to encoding a digital signal comprising audio data includes selecting frequency domain coding or time domain coding based on a coding bit rate to be used for coding the digital signal and a short pitch lag detection of the digital signal.

65.

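Invention application
Unvoiced Voiced Decision For Speech Processing Cross Reference To Related Applications
Status: Under examination (published)

Publication (announcement) number: US20200005812A1

Publication (announcement) date: 2020-01-02

Application number: US16506357

Application date: 2019-07-09

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yang Gao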

IPC: G10L25/78 , G10L25/93 , G10L19/22

Abstract: Method and apparatus for speech processing are disclosed. A first unvoicing parameter for a first frame of a speech signal is determined, and further smoothed based on a second unvoicing parameter for a second frame prior to the first frame. A difference between the first unvoicing parameter and the smoothed unvoicing parameter for the first subframe is computed, and an unvoiced/voiced classification of the first frame is determined using the computed difference as a decision parameter. Further processing, such as bandwidth extension (BWE), is performed based on the classification of the first frame.
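The decision flow described in the abstract can be sketched in Python as below. This is only an illustration: the recursive smoothing form, the smoothing factor, the two thresholds, and the rule of keeping the previous decision when the difference lies between the thresholds are all assumptions, not details given in the abstract.

```python
def classify_frames(unvoicing, alpha=0.9, upper=0.1, lower=-0.1):
    """Label each frame 'voiced' or 'unvoiced' from per-frame unvoicing parameters.

    alpha, upper, and lower are hypothetical tuning values.
    """
    smoothed = None
    decision = "voiced"  # assumed initial state
    decisions = []
    for p in unvoicing:
        # Smooth the current parameter using the previous frame's smoothed value.
        smoothed = p if smoothed is None else alpha * smoothed + (1 - alpha) * p
        # The difference between raw and smoothed values is the decision parameter.
        diff = p - smoothed
        if diff > upper:
            decision = "unvoiced"
        elif diff < lower:
            decision = "voiced"
        # Otherwise keep the previous decision (assumed hysteresis).
        decisions.append(decision)
    return decisions
```

A jump in the unvoicing parameter makes the difference positive and flips the label to unvoiced, and a drop makes it negative and flips the label back.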

66.

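Invention application
Audio Classification Based on Perceptual Quality for Low or Medium Bit Rates
Status: Under examination (published)

Publication (announcement) number: US20190237088A1

Publication (announcement) date: 2019-08-01

Application number: US16375583

Application date: 2019-04-04

Applicant: Huawei Technologies Co., Ltd.

Inventor: Yang Gao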

IPC: G10L19/24 , G10L25/93 , G10L19/20 , G10L19/002

CPC classification number: G10L19/24 , G10L19/002 , G10L19/20 , G10L25/06 , G10L25/90 , G10L25/93 , G10L2025/937

Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICE signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for re-classification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameter may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals which are re-classified as VOICED signals may be encoded in the time-domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency-domain.
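A minimal Python sketch of the reclassification idea follows. It assumes two periodicity criteria drawn from the examples in the abstract (pitch differences between subframes and an average normalized pitch correlation). The thresholds, the bit rate limit, and the function name are hypothetical.

```python
def reclassify(label, bit_rate, pitch_lags, pitch_corrs,
               max_rate=24000, max_lag_diff=3, min_avg_corr=0.8):
    """Return 'VOICED' if an AUDIO frame looks periodic enough, else the original label.

    All threshold values are illustrative assumptions.
    """
    # Only AUDIO frames at low or medium bit rates are candidates.
    if label != "AUDIO" or bit_rate > max_rate:
        return label
    # Criterion 1: pitch lag changes little between consecutive subframes.
    stable = all(abs(a - b) <= max_lag_diff
                 for a, b in zip(pitch_lags, pitch_lags[1:]))
    # Criterion 2: average normalized pitch correlation is high.
    avg_corr = sum(pitch_corrs) / len(pitch_corrs)
    return "VOICED" if stable and avg_corr >= min_avg_corr else "AUDIO"
```

In a codec built along these lines, frames returned as VOICED would go to the time-domain coder and frames left as AUDIO to the frequency-domain coder.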

67.

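Invention grant
Bandwidth extension system and approach
Status: In force

Publication (announcement) number: US10217470B2

Publication (announcement) date: 2019-02-26

Application number: US15256182

Application date: 2016-09-02

Applicant: Huawei Technologies Co., Ltd.

Inventor: Yang Gao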

IPC: G10L19/02 , G10L21/038 , G10L19/022 , G10L19/028 , G10L19/26

Abstract: A method of performing BandWidth Extension (BWE) includes a frequency band shifting approach to generate an extended high band signal in time domain and a gain determination approach of controlling the energy of the extended high band. The proposed approach allows shifting any size of low band to any size of high band. The BWE scaling gain is estimated by using available filter bank coefficients with extremely low bit rate or without costing any bit, combining three possible gain factors.

68.

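Invention grant
Classification between time-domain coding and frequency domain coding
Status: In force

Publication (announcement) number: US09837092B2

Publication (announcement) date: 2017-12-05

Application number: US15592573

Application date: 2017-05-11

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yang Gao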

IPC: G10L21/00 , G10L19/125 , G10L19/22 , G10L19/002 , G10L19/00

CPC classification number: G10L19/125 , G10L19/002 , G10L19/22 , G10L2019/0002 , G10L2019/0011 , G10L2019/0016

Abstract: A method for processing speech signals prior to encoding a digital signal comprising audio data includes selecting frequency domain coding or time domain coding based on a coding bit rate to be used for coding the digital signal and a short pitch lag detection of the digital signal.

69.

Invention grant
System and method for audio coding and decoding
Status: In force

Publication (announcement) number: US09646616B2

Publication (announcement) date: 2017-05-09

Application number: US14509737

Application date: 2014-10-08

Applicant: Huawei Technologies Co., Ltd.

Inventor: David Virette, Yang Gao, Wei Xiao

IPC: G10L19/26 , G10L25/03 , G10L25/18 , G10L19/00

CPC classification number: G10L19/00 , G10L19/26 , G10L25/18

Abstract: In accordance with an embodiment, a method of generating an encoded audio signal includes estimating a time-frequency energy of an input audio signal from a time-frequency filter bank, computing a global variance of the time-frequency energy, determining a post-processing method according to the global variance, and transmitting an encoded representation of the input audio signal along with an indication of the determined post-processing method.
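The selection step in the abstract can be sketched as follows, assuming the time-frequency energy is available as a grid of per-slot, per-band values and that "global variance" means the variance over all cells of that grid. The threshold, its direction, and the method names `method_a` and `method_b` are placeholders, not taken from the patent.

```python
from statistics import pvariance


def choose_post_processing(tf_energy, threshold=30.0):
    """Pick a post-processing method from the global variance of a time-frequency grid.

    tf_energy: list of time slots, each a list of per-band energies.
    threshold and the returned method names are hypothetical.
    """
    # Flatten the grid so the variance is taken over every time-frequency cell.
    flat = [e for slot in tf_energy for e in slot]
    global_variance = pvariance(flat)
    method = "method_a" if global_variance < threshold else "method_b"
    return global_variance, method
```

The encoder would then transmit the chosen method as a flag alongside the encoded signal so that the decoder can apply the same post-processing.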

70.

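Invention application
Audio Classification Based on Perceptual Quality for Low or Medium Bit Rates
Status: Under examination (published)

Publication (announcement) number: US20170116999A1

Publication (announcement) date: 2017-04-27

Application number: US15398321

Application date: 2017-01-04

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Yang Gao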

IPC: G10L19/24 , G10L25/93

CPC classification number: G10L19/24 , G10L19/002 , G10L19/20 , G10L25/06 , G10L25/90 , G10L25/93 , G10L2025/937

Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICE signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for re-classification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameter may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals which are re-classified as VOICED signals may be encoded in the time-domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency-domain.
