-
Publication No.: US20240430482A1
Publication Date: 2024-12-26
Application No.: US18823504
Filing Date: 2024-09-03
Inventor: Semih Esenlik, Yaojun Wu, Zhaobin Zhang, Yue Li, Kai Zhang, Li Zhang
IPC: H04N19/61, H04N19/124, H04N19/132, H04N19/42, H04N19/463, H04N19/91
Abstract: Embodiments of the present disclosure provide a solution for visual data processing. A method for visual data processing is proposed. The method comprises: obtaining, for a conversion between visual data and a bitstream of the visual data, an intermediate representation of the visual data, the intermediate representation being different from a quantized latent representation of the visual data and being generated based on at least one of the following: at least one parameter, at least a part of the quantized latent representation, a prediction of the at least a part of the quantized latent representation, or a difference between the prediction and the at least a part of the quantized latent representation; and performing, for the conversion, a synthesis transform on the intermediate representation, wherein the quantized latent representation is generated based on applying a first neural network to the visual data.
-
Publication No.: US20250030857A1
Publication Date: 2025-01-23
Application No.: US18828890
Filing Date: 2024-09-09
Inventor: Yaojun Wu, Semih Esenlik, Zhaobin Zhang, Yue Li, Kai Zhang, Li Zhang
IPC: H04N19/126, H04N19/42
Abstract: Embodiments of the present disclosure provide a solution for visual data processing. A method for visual data processing is proposed. The method comprises: performing, for a conversion between visual data and a bitstream of the visual data, a quantization process on a dataset comprising at least one of: input visual data of a neural network model used for the conversion, or a parameter of the neural network model; and performing the conversion based on the quantization process.
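As a rough illustration of what a quantization process applied to model parameters can look like, the sketch below uses plain symmetric uniform quantization. The abstract does not specify a scheme, so the choice of symmetric uniform quantization, the function names `quantize` and `dequantize`, and the `num_bits` parameter are all assumptions made for this example, not details of the claimed method.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers with one shared scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid a zero scale when every value is zero (or the list is empty).
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the signed range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer levels."""
    return [x * scale for x in q]


# Hypothetical parameter values, used only to exercise the functions.
weights = [0.5, -1.0, 0.25, 0.0]
levels, scale = quantize(weights)
restored = dequantize(levels, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the usual error bound for round-to-nearest uniform quantization.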
-
Publication No.: US20250119552A1
Publication Date: 2025-04-10
Application No.: US18982647
Filing Date: 2024-12-16
Applicant: Douyin Vision (Beijing) Co., Ltd., Bytedance Inc.
Inventor: Zhaobin Zhang, Semih Esenlik, Yaojun Wu, Yue Li, Kai Zhang, Li Zhang
IPC: H04N19/149, H04N19/122, H04N19/186
Abstract: A mechanism for processing video data is disclosed. A determination is made to resize an image with a first size to create a resized image with a second size. A conversion is performed between visual media data and a bitstream based on the resized image. The conversion includes applying a neural network-based coding model to the resized image to achieve variable rate neural network-based compression.
-
Publication No.: US20250159258A1
Publication Date: 2025-05-15
Application No.: US19022954
Filing Date: 2025-01-15
Applicant: Douyin Vision (Beijing) Co., Ltd., Bytedance Inc.
Inventor: Semih Esenlik, Zhaobin Zhang, Yaojun Wu, Yue Li, Kai Zhang, Li Zhang
IPC: H04N19/625, H04N19/124, H04N19/63
Abstract: Embodiments of the present disclosure provide a solution for visual data processing. A method for visual data processing is proposed. The method comprises: obtaining, for a conversion between visual data and a bitstream of the visual data, region information indicating positions and sizes of a plurality of regions in a quantized latent representation of the visual data; selecting, based on the region information, a set of target neighboring samples from a plurality of candidate neighboring samples of a current sample in the quantized latent representation, the set of target neighboring samples being in the same region as the current sample; determining statistical information of the current sample based on the set of target neighboring samples; and performing the conversion based on the statistical information.
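The steps in this abstract (read region information, keep only the candidate neighbors that lie in the same region as the current sample, derive statistics from them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: regions are assumed to be rectangles given as `(top, left, height, width)`, the candidate neighbors are assumed to be the four causal positions above and to the left, and the "statistical information" is assumed to be a mean and standard deviation. All function names are hypothetical.

```python
from statistics import mean, pstdev

# Assumed causal candidate offsets: up-left, up, up-right, left.
CANDIDATE_OFFSETS = ((-1, -1), (-1, 0), (-1, 1), (0, -1))


def region_of(pos, regions):
    """Return the index of the rectangle containing pos, or None."""
    r, c = pos
    for idx, (top, left, height, width) in enumerate(regions):
        if top <= r < top + height and left <= c < left + width:
            return idx
    return None


def target_neighbors(pos, regions, offsets=CANDIDATE_OFFSETS):
    """Keep only candidates that fall in the same region as pos."""
    current = region_of(pos, regions)
    if current is None:
        return []
    candidates = [(pos[0] + dr, pos[1] + dc) for dr, dc in offsets]
    return [p for p in candidates if region_of(p, regions) == current]


def sample_statistics(latent, pos, regions):
    """Mean and standard deviation over the selected neighbors.

    Falls back to (0.0, 1.0) when no neighbor is available (assumed default).
    """
    values = [latent[r][c] for r, c in target_neighbors(pos, regions)]
    if not values:
        return 0.0, 1.0
    return mean(values), max(pstdev(values), 1e-6)


# Hypothetical 4x4 quantized latent split into left and right halves.
latent = [[r * 4 + c for c in range(4)] for r in range(4)]
regions = [(0, 0, 4, 2), (0, 2, 4, 2)]
neighbors = target_neighbors((1, 2), regions)
stats = sample_statistics(latent, (1, 2), regions)
```

For the sample at `(1, 2)`, the candidates at `(0, 1)` and `(1, 1)` belong to the left region and are dropped, so the statistics depend only on `(0, 2)` and `(0, 3)`. Restricting the context this way means a region's statistics never depend on samples outside it.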
-