-
Publication number: US20230046850A1
Publication date: 2023-02-16
Application number: US17974851
Application date: 2022-10-27
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Xianbo MENG , Bingyin XIA , Zhe WANG
IPC: G10L19/08 , G10L19/008 , G10L19/032
Abstract: A linear prediction coding (LPC) parameter coding method is provided. The method includes: determining a reference LPC parameter from a plurality of LPC parameters, performing direct coding on the reference LPC parameter, and performing reference coding on a non-reference LPC parameter based on the determined reference LPC parameter. The method further includes: obtaining a direct coding result of the reference LPC parameter and determining a residual coding result of the non-reference LPC parameter.
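The reference/residual idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the application: the uniform scalar quantizer, the step sizes, the rule in `pick_reference` (choose the parameter set closest to all others), and the function names themselves.

```python
def pick_reference(lpc_sets):
    """Hypothetical rule: pick the set with the smallest total L1 distance to all sets."""
    def total_distance(i):
        return sum(abs(a - b) for other in lpc_sets for a, b in zip(lpc_sets[i], other))
    return min(range(len(lpc_sets)), key=total_distance)


def encode_lpc_sets(lpc_sets, ref_index, direct_step=0.01, residual_step=0.02):
    """Directly quantize the reference set; quantize the others as residuals against it."""
    direct = [round(v / direct_step) for v in lpc_sets[ref_index]]
    # Residuals are taken against the *reconstructed* reference, as a decoder would see it.
    ref_hat = [q * direct_step for q in direct]
    residuals = {}
    for i, params in enumerate(lpc_sets):
        if i != ref_index:
            residuals[i] = [round((p - r) / residual_step) for p, r in zip(params, ref_hat)]
    return direct, residuals


def decode_lpc_sets(ref_index, direct, residuals, direct_step=0.01, residual_step=0.02):
    """Rebuild every parameter set from the direct indices and the residual indices."""
    ref_hat = [q * direct_step for q in direct]
    decoded = {ref_index: ref_hat}
    for i, res in residuals.items():
        decoded[i] = [r + q * residual_step for r, q in zip(ref_hat, res)]
    return [decoded[i] for i in sorted(decoded)]
```

Because the non-reference sets are coded only as differences from the reference, their residuals tend to be small when the sets are similar, which is where the bit saving would come from in a scheme of this kind.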
-
Publication number: US20240119950A1
Publication date: 2024-04-11
Application number: US18538708
Application date: 2023-12-13
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yuan GAO , Shuai LIU , Bingyin XIA , Bin WANG , Zhe WANG
IPC: G10L19/008 , G10L25/21 , H04S7/00
CPC classification number: G10L19/008 , G10L25/21 , H04S7/30
Abstract: A method for encoding a three-dimensional audio signal is provided. In the method, an encoder obtains a current frame of a three-dimensional audio signal and obtains the coding efficiency of an initial virtual speaker for the current frame based on that frame. When the coding efficiency of the initial virtual speaker meets a preset condition, the encoder determines an updated virtual speaker for the current frame from a set of candidate virtual speakers and encodes the current frame based on the updated virtual speaker, to obtain a first bitstream. When the coding efficiency of the initial virtual speaker does not meet the preset condition, the encoder encodes the current frame based on the initial virtual speaker, to obtain a second bitstream.
-
Publication number: US20240087585A1
Publication date: 2024-03-14
Application number: US18515612
Application date: 2023-11-21
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Bingyin XIA , Jiawei LI , Zhe WANG
IPC: G10L19/24 , G10L19/16 , G10L25/30 , H04N19/184 , H04N19/91
CPC classification number: G10L19/24 , G10L19/167 , G10L25/30 , H04N19/184 , H04N19/91 , G10L19/173
Abstract: This application discloses an encoding method and apparatus, a decoding method and apparatus, a device, a storage medium, and a computer program, and relates to the field of encoding and decoding technologies. In this method, a first latent variable is scaled based on a first variable scale factor, to obtain a second latent variable, and a quantity of coding bits of an entropy coding result of the second latent variable meets a preset encoding rate condition. This ensures that a quantity of coding bits of an entropy coding result of a latent variable corresponding to each frame of media data can meet the preset encoding rate condition, that is, the quantity of coding bits of the entropy coding result of the latent variable corresponding to each frame of media data can be basically consistent, instead of dynamically changing.
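A rough sketch of rate-constrained latent scaling follows. It rests on assumptions that are not in the abstract: the rate condition is taken to be "at most `target_bits`", the entropy coder is stood in for by signed order-0 Exp-Golomb code lengths, and the scale factor is found by bisection. The names `estimate_bits` and `scale_to_rate` are hypothetical.

```python
def estimate_bits(symbols):
    """Bit count proxy: signed order-0 Exp-Golomb code length of each integer symbol."""
    total = 0
    for n in symbols:
        k = 2 * abs(n) - (1 if n > 0 else 0)   # map signed value to an unsigned index
        total += 2 * (k + 1).bit_length() - 1  # Exp-Golomb length of index k
    return total


def scale_to_rate(latent, target_bits, iterations=30):
    """Find a scale factor in [0, 1] so the quantized, scaled latent fits in target_bits."""
    def quantize(scale):
        return [round(x * scale) for x in latent]

    if estimate_bits(quantize(0.0)) > target_bits:
        raise ValueError("target_bits is too small even for an all-zero latent")
    if estimate_bits(quantize(1.0)) <= target_bits:
        return 1.0, quantize(1.0)

    low, high = 0.0, 1.0  # invariant: the latent scaled by `low` always fits
    for _ in range(iterations):
        mid = (low + high) / 2
        if estimate_bits(quantize(mid)) <= target_bits:
            low = mid
        else:
            high = mid
    return low, quantize(low)
```

Bisection works here because the proxy bit count never decreases as the scale factor grows. A decoder would divide the dequantized values by the same scale factor, which therefore has to be signalled alongside the latent.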
-
Publication number: US20240112684A1
Publication date: 2024-04-04
Application number: US18532085
Application date: 2023-12-07
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Shuai LIU , Yuan GAO , Bingyin XIA , Bin WANG , Zhe WANG
IPC: G10L19/002 , G10L19/008 , H04S7/00
CPC classification number: G10L19/002 , G10L19/008 , H04S7/00
Abstract: Embodiments of this application disclose a three-dimensional audio signal processing method and apparatus, to implement bit allocation of a signal. The method includes: performing spatial coding on a to-be-coded three-dimensional audio signal, to obtain a transmission channel signal and transmission channel attribute information, where the transmission channel signal includes at least one virtual speaker signal group and at least one residual signal group; and determining a bit allocation ratio of the virtual speaker signal group and a bit allocation ratio of the residual signal group based on the transmission channel attribute information.
-
Publication number: US20220358941A1
Publication date: 2022-11-10
Application number: US17864116
Application date: 2022-07-13
Applicant: Huawei Technologies Co., Ltd.
Inventor: Bingyin XIA , Jiawei LI , Zhe WANG
Abstract: The present disclosure discloses an audio encoding and decoding method and an audio encoder and decoder. The audio encoding method includes: obtaining a current frame of an audio signal, where the current frame includes a high frequency band signal and a low frequency band signal; obtaining a first encoding parameter based on the high frequency band signal and the low frequency band signal; obtaining a second encoding parameter of the current frame based on the high frequency band signal, where the second encoding parameter includes tone component information; and performing bitstream multiplexing on the first encoding parameter and the second encoding parameter, to obtain an encoded bitstream.
-
Publication number: US20220343927A1
Publication date: 2022-10-27
Application number: US17863114
Application date: 2022-07-12
Applicant: Huawei Technologies Co., Ltd.
Inventor: Bingyin XIA , Jiawei LI , Zhe WANG
Abstract: Disclosed is an audio coding method, including: obtaining a current frame of an audio signal, which includes a high frequency band signal and a low frequency band signal; obtaining a first encoding parameter based on the high frequency band signal and the low frequency band signal; obtaining a second encoding parameter based on the high frequency band signal, where the second encoding parameter includes tone component information of the high frequency band signal; obtaining a third encoding parameter based on the high frequency band signal, where the third encoding parameter includes sub-band envelope information of a part of a sub-band of the high frequency band signal that needs to be encoded; and performing bitstream multiplexing on the first encoding parameter, the second encoding parameter, and the third encoding parameter, to obtain an encoded bitstream.
-
Publication number: US20240177721A1
Publication date: 2024-05-30
Application number: US18423083
Application date: 2024-01-25
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Bingyin XIA , Jiawei LI , Zhe WANG
IPC: G10L19/022 , G10L25/30
CPC classification number: G10L19/022 , G10L25/30
Abstract: Embodiments of this application disclose an audio signal encoding and decoding method, including: obtaining, based on spectra of M blocks of a current frame of a to-be-encoded audio signal, M transient state identifiers of the M blocks, where the M blocks include a first block, and a transient state identifier of the first block indicates that the first block is a transient state block, or indicates that the first block is a non-transient state block; obtaining group information of the M blocks based on the M transient state identifiers of the M blocks; performing grouping and arranging on the spectra of the M blocks based on the group information of the M blocks, to obtain a to-be-encoded spectrum of the current frame; encoding the to-be-encoded spectrum by using an encoding neural network to obtain a spectrum encoding result; and writing the spectrum encoding result into a bitstream.
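The grouping and arranging step can be sketched as follows. The abstract does not say how a transient state identifier is computed or how groups are ordered, so two illustrative assumptions are made here: a block counts as transient when its energy exceeds `ratio` times the previous block's energy, and transient blocks are placed before non-transient ones. The neural-network encoding stage is left out, and all names are hypothetical.

```python
def transient_flags(spectra, ratio=4.0):
    """Assumed detector: a block is transient if its energy jumps versus the previous block."""
    energies = [sum(c * c for c in block) for block in spectra]
    flags = [False]  # the first block has no predecessor, so treat it as non-transient
    for prev, cur in zip(energies, energies[1:]):
        flags.append(cur > ratio * prev)
    return flags


def group_and_arrange(spectra, flags):
    """Group block indices by transient state and concatenate spectra group by group."""
    transient = [i for i, f in enumerate(flags) if f]
    steady = [i for i, f in enumerate(flags) if not f]
    group_info = {"transient": transient, "non_transient": steady}
    arranged = [coef for i in transient + steady for coef in spectra[i]]
    return group_info, arranged
```

The returned `group_info` is what a decoder would need in order to undo the rearrangement and restore the original block order.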
-
Publication number: US20240105189A1
Publication date: 2024-03-28
Application number: US18533612
Application date: 2023-12-08
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Jiawei LI , Bingyin XIA , Zhe WANG
CPC classification number: G10L19/0204 , G10L19/167 , G10L25/18 , G10L25/21
Abstract: This application discloses an encoding method and apparatus, a decoding method and apparatus, a device, a storage medium, and a computer program, and belongs to the field of encoding and decoding technologies. In embodiments of this application, a first whitened spectrum for media data is whitened to obtain a second whitened spectrum, and then encoding is performed based on the second whitened spectrum. A spectral amplitude of the second whitened spectrum in a target frequency band is greater than or equal to a spectral amplitude of the first whitened spectrum in the target frequency band. It can be learned that, in this solution, the spectral amplitude of the first whitened spectrum in the target frequency band is increased, so that a difference between statistical average energy of spectral lines for different frequencies in the obtained second whitened spectrum is small.
-
Publication number: US20230368801A1
Publication date: 2023-11-16
Application number: US18224237
Application date: 2023-07-20
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Xianbo MENG , Bin WANG , Zhe WANG , Bingyin XIA
IPC: G10L19/002 , G10L19/02 , G10L25/21
CPC classification number: G10L19/002 , G10L19/02 , G10L25/21
Abstract: A bit allocation method and apparatus for an audio object are disclosed, which relate to the field of audio encoding and decoding technologies. The method includes: separately pre-rendering a plurality of audio objects to be pre-rendered in an audio frame, to obtain a plurality of pre-rendered audio objects; obtaining respective perceptual importance parameter values of the plurality of pre-rendered audio objects; obtaining a bit allocation parameter value of a current audio object to be pre-rendered based on the respective perceptual importance parameter values of the plurality of pre-rendered audio objects; and determining, based on the bit allocation parameter value of the current audio object to be pre-rendered and a total quantity of to-be-allocated bits corresponding to the plurality of audio objects to be pre-rendered, a target quantity of bits allocated to the current audio object to be pre-rendered.
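One way to picture the last two steps of this abstract is a proportional split. This is only a sketch under assumptions: the bit allocation parameter is taken to be each object's share of the summed perceptual importance, importance values are assumed non-negative, and leftover bits from rounding go to the objects with the largest shares. How importance is measured after pre-rendering is not modelled, and `allocate_bits` is a hypothetical name.

```python
from fractions import Fraction
from math import floor


def allocate_bits(importance, total_bits):
    """Split total_bits across audio objects in proportion to perceptual importance."""
    weights = [Fraction(v) for v in importance]  # exact arithmetic avoids rounding drift
    total_weight = sum(weights)
    count = len(weights)
    if total_weight == 0:
        shares = [Fraction(1, count)] * count    # no information: split evenly
    else:
        shares = [w / total_weight for w in weights]

    bits = [floor(total_bits * s) for s in shares]
    leftover = total_bits - sum(bits)            # always fewer than `count` bits
    by_share = sorted(range(count), key=lambda i: shares[i], reverse=True)
    for i in by_share[:leftover]:
        bits[i] += 1
    return [float(s) for s in shares], bits
```

For example, `allocate_bits([6, 3, 1], 1000)` gives shares of 0.6, 0.3, and 0.1 and a bit budget of 600, 300, and 100.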
-
Publication number: US20230154471A1
Publication date: 2023-05-18
Application number: US18153128
Application date: 2023-01-11
Applicant: Huawei Technologies Co., Ltd.
Inventor: Zhi WANG , Jiance DING , Bingyin XIA , Bin WANG , Zhe WANG
IPC: G10L19/008 , G10L25/06
CPC classification number: G10L19/008 , G10L25/06
Abstract: A multi-channel audio signal encoding method includes: obtaining a to-be-encoded first audio frame; obtaining a correlation value set, where the correlation value set includes respective correlation values of a plurality of channel pairs; selecting M correlation values from the correlation value set, where all the M correlation values are greater than correlation values other than the M correlation values in the correlation value set, and all the M correlation values are greater than or equal to a pairing threshold; obtaining M channel pair sets; determining a target channel pair set from the M channel pair sets, where a sum of correlation values of all channel pairs in the target channel pair set is the largest in those of the M channel pair sets; and encoding the first audio frame based on the target channel pair set.
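The selection logic can be sketched as below. The abstract does not state how each of the M channel pair sets is built, so this sketch assumes one plausible construction: each of the top-M pairs (those at or above the pairing threshold) seeds a set, which is then filled greedily with the highest-correlation pairs that share no channel with pairs already chosen. The function name and the dictionary input format are hypothetical.

```python
def select_channel_pairs(correlations, m, threshold):
    """Build up to m candidate pair sets and return the one with the largest correlation sum.

    correlations maps a channel pair (i, j) to its correlation value.
    """
    ranked = sorted(correlations.items(), key=lambda item: item[1], reverse=True)
    seeds = [(pair, value) for pair, value in ranked[:m] if value >= threshold]

    candidate_sets = []
    for seed_pair, seed_value in seeds:
        chosen = [(seed_pair, seed_value)]
        used = set(seed_pair)
        for pair, value in ranked:
            if value < threshold:
                break                      # remaining pairs are too weakly correlated
            if used.isdisjoint(pair):      # a channel may belong to only one pair
                chosen.append((pair, value))
                used.update(pair)
        candidate_sets.append(chosen)

    if not candidate_sets:
        return []
    best = max(candidate_sets, key=lambda pairs: sum(value for _, value in pairs))
    return [pair for pair, _ in best]
```

Comparing several seeded sets matters because starting from the single strongest pair can block two other strong pairs whose combined correlation is higher.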