-
Publication number: US20240137517A1
Publication date: 2024-04-25
Application number: US18397302
Filing date: 2023-12-27
Inventor: Chaoyi Lin , Yue Li , Kai Zhang , Zhaobin Zhang , Li Zhang
IPC: H04N19/132 , G06T3/4046 , G06T3/4053 , H04N19/33 , H04N19/59 , H04N19/88
CPC classification number: H04N19/132 , G06T3/4046 , G06T3/4053 , H04N19/33 , H04N19/59 , H04N19/88
Abstract: A method of processing video data. The method includes applying a super resolution (SR) process to a video unit at a specific position relative to one or more in-loop filters when the one or more in-loop filters are applied to the video unit, and performing a conversion between a video comprising the video unit and a bitstream of the video based on the SR process and the one or more in-loop filters as applied. A corresponding video coding apparatus and non-transitory computer-readable recording medium are also disclosed.
-
Publication number: US20250168370A1
Publication date: 2025-05-22
Application number: US19033178
Filing date: 2025-01-21
Applicant: Bytedance Inc.
Inventor: Semih Esenlik , Zhaobin Zhang , Kai Zhang , Li Zhang
IPC: H04N19/192 , H04N19/124 , H04N19/70
Abstract: An image decoding method including transforming an input image into latent samples using an analysis transform; quantizing the latent samples using a hyper encoder to generate quantized hyper latent samples; encoding the quantized hyper latent samples into a bitstream using entropy encoding; applying a latent sample prediction process to obtain quantized latent samples and quantized residual latent samples based on the latent samples using the quantized hyper latent samples; obtaining prediction samples following the latent sample prediction process; and entropy encoding the quantized hyper latent samples and the quantized residual latent samples into the bitstream.
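As a rough illustration of the latent sample prediction step named in this abstract, the sketch below forms a residual between each latent sample and its prediction, quantizes the residual, and rebuilds the quantized latent as prediction plus quantized residual. The function name, the rounding quantizer, and the example values are assumptions made for illustration; the abstract does not specify them, and the hyper encoder and the entropy coder are left out.

```python
def predict_and_quantize(latents, predictions):
    """Return (quantized_residuals, quantized_latents) for paired samples.

    Assumption: the quantizer is plain rounding to the nearest integer.
    """
    quantized_residuals = [round(y - p) for y, p in zip(latents, predictions)]
    quantized_latents = [p + r for p, r in zip(predictions, quantized_residuals)]
    return quantized_residuals, quantized_latents


latents = [3.4, -1.2, 0.7]       # output of a hypothetical analysis transform
predictions = [2.0, -1.0, 0.0]   # hypothetical prediction derived from the hyper latents
residuals, reconstructed = predict_and_quantize(latents, predictions)
```

Only the integer residuals would need to be entropy coded in this sketch, since a decoder that derives the same predictions can rebuild the quantized latents from them.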
-
Publication number: US20240430482A1
Publication date: 2024-12-26
Application number: US18823504
Filing date: 2024-09-03
Inventor: Semih ESENLIK , Yaojun Wu , Zhaobin Zhang , Yue Li , Kai Zhang , Li Zhang
IPC: H04N19/61 , H04N19/124 , H04N19/132 , H04N19/42 , H04N19/463 , H04N19/91
Abstract: Embodiments of the present disclosure provide a solution for visual data processing. A method for visual data processing is proposed. The method comprises: obtaining, for a conversion between visual data and a bitstream of the visual data, an intermediate representation of the visual data, the intermediate representation being different from a quantized latent representation of the visual data and being generated based on at least one of the following: at least one parameter, at least a part of the quantized latent representation, a prediction of the at least a part of the quantized latent representation, or a difference between the prediction and the at least a part of the quantized latent representation; and performing, for the conversion, a synthesis transform on the intermediate representation, wherein the quantized latent representation is generated based on applying a first neural network to the visual data.
-
Publication number: US20240236325A9
Publication date: 2024-07-11
Application number: US18399926
Filing date: 2023-12-29
Inventor: Chaoyi Lin , Yue Li , Kai Zhang , Zhaobin Zhang , Li Zhang
IPC: H04N19/132 , H04N19/154 , H04N19/186
CPC classification number: H04N19/132 , H04N19/154 , H04N19/186 , H04N19/625
Abstract: A method of processing video data. The method includes down-sampling a video unit of a video prior to application of a super resolution (SR) process and performing a conversion between the video including the video unit and a bitstream of the video based on the video unit as down-sampled. A corresponding video coding apparatus and non-transitory computer-readable recording medium are also disclosed.
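The order of operations in this abstract (down-sample first, code the smaller unit, restore resolution afterwards) can be sketched as follows. The 2x2 averaging filter and the nearest-neighbour up-sampler are assumptions that stand in for the unspecified down-sampling filter and SR process; both function names are hypothetical.

```python
def downsample_2x(block):
    """Average each non-overlapping 2x2 group (assumed filter, even dimensions)."""
    height, width = len(block), len(block[0])
    return [
        [
            (block[y][x] + block[y][x + 1] + block[y + 1][x] + block[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def upsample_2x_nearest(block):
    """Placeholder for the SR process: repeat each sample into a 2x2 group."""
    out = []
    for row in block:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


video_unit = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 32, 40, 42],
    [34, 36, 44, 46],
]
low_res = downsample_2x(video_unit)       # the unit that would be coded
restored = upsample_2x_nearest(low_res)   # stand-in for SR after decoding
```

In this sketch the coded unit has a quarter of the original samples, and the up-sampler returns it to the original 4x4 size.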
-
Publication number: US20250159214A1
Publication date: 2025-05-15
Application number: US19021539
Filing date: 2025-01-15
Applicant: Bytedance Inc.
Inventor: Zhaobin Zhang , Semih Esenlik , Kai Zhang , Li Zhang
IPC: H04N19/189 , H04N19/119 , H04N19/176
Abstract: An image decoding method including obtaining reconstructed latents ŷ[:,:,:] using an arithmetic decoder; feeding the reconstructed latents into a synthesis neural network; tile partitioning output feature maps into multiple parts based on decoded parameters at one or multiple locations; separately feeding each of the multiple parts into a next stage of a plurality of convolutional layers to obtain spatially partitioned feature maps at an output; and cropping and stitching the spatially partitioned feature maps back to a whole feature map spatially until an image is reconstructed.
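The partition, process, crop, and stitch sequence in this abstract can be sketched on a single 2D feature map. This is a minimal sketch under stated assumptions: a 3x3 box-sum filter stands in for a convolutional layer, the map is split into two column ranges, and the `split` and `halo` parameters are hypothetical stand-ins for the decoded partitioning parameters.

```python
def conv3x3_sum(fmap):
    """3x3 box-sum filter with zero padding; stands in for a conv layer."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(
                fmap[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if 0 <= y + dy < h and 0 <= x + dx < w
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def tiled_conv3x3_sum(fmap, split, halo=1):
    """Split columns at `split`, filter each part with a halo, crop, stitch."""
    w = len(fmap[0])
    # Each part carries `halo` extra columns from its neighbour so samples
    # next to the tile boundary see the same inputs as in the whole map.
    left = [row[: min(w, split + halo)] for row in fmap]
    right_start = max(0, split - halo)
    right = [row[right_start:] for row in fmap]
    left_out = conv3x3_sum(left)
    right_out = conv3x3_sum(right)
    # Crop the halo columns, then stitch the two parts back together.
    return [
        l_row[:split] + r_row[split - right_start:]
        for l_row, r_row in zip(left_out, right_out)
    ]


feature_map = [[x + 6 * y for x in range(6)] for y in range(4)]
whole = conv3x3_sum(feature_map)
stitched = tiled_conv3x3_sum(feature_map, split=3)
```

With a halo at least as wide as the filter radius, the stitched result matches the whole-map result, while each part can be processed separately with a smaller memory footprint.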
-
Publication number: US20250119552A1
Publication date: 2025-04-10
Application number: US18982647
Filing date: 2024-12-16
Applicant: Douyin Vision (Beijing) Co., Ltd. , Bytedance Inc.
Inventor: Zhaobin Zhang , Semih Esenlik , Yaojun Wu , Yue Li , Kai Zhang , Li Zhang
IPC: H04N19/149 , H04N19/122 , H04N19/186
Abstract: A mechanism for processing video data is disclosed. A determination is made to resize an image with a first size to create a resized image with a second size. A conversion is performed between visual media data and a bitstream based on the resized image. The conversion includes applying a neural network-based coding model to the resized image to achieve variable rate neural network-based compression.
-
Publication number: US20240414381A1
Publication date: 2024-12-12
Application number: US18807764
Filing date: 2024-08-16
Inventor: Yaojun WU , Yue Li , Zhaobin Zhang , Semih Esenlik , Kai Zhang , Li Zhang
Abstract: Embodiments of the present disclosure provide a solution for data processing. A method for data processing is proposed. The method comprises: determining, by using a first model with an attention mechanism during a conversion between data and a bitstream of the data, a probability distribution for entropy coding associated with the bitstream; and performing the conversion based on the probability distribution.
-
Publication number: US20240137519A1
Publication date: 2024-04-25
Application number: US18399926
Filing date: 2023-12-29
Inventor: Chaoyi Lin , Yue Li , Kai Zhang , Zhaobin Zhang , Li Zhang
IPC: H04N19/132 , H04N19/154 , H04N19/186
CPC classification number: H04N19/132 , H04N19/154 , H04N19/186 , H04N19/625
Abstract: A method of processing video data. The method includes down-sampling a video unit of a video prior to application of a super resolution (SR) process and performing a conversion between the video including the video unit and a bitstream of the video based on the video unit as down-sampled. A corresponding video coding apparatus and non-transitory computer-readable recording medium are also disclosed.
-
Publication number: US20250168369A1
Publication date: 2025-05-22
Application number: US19033120
Filing date: 2025-01-21
Applicant: Bytedance Inc.
Inventor: Semih Esenlik , Zhaobin Zhang , Kai Zhang , Li Zhang
IPC: H04N19/192 , H04N19/124 , H04N19/159 , H04N19/184
Abstract: A mechanism for processing video data in a neural network is disclosed. The mechanism includes obtaining quantized residual latent samples. The quantized residual latent samples are processed to obtain processed quantized residual latent samples. A reconstructed latent sample can then be acquired based on the processed quantized residual latent samples.
-
Publication number: US20250159258A1
Publication date: 2025-05-15
Application number: US19022954
Filing date: 2025-01-15
Applicant: Douyin Vision (Beijing) Co., Ltd. , Bytedance Inc.
Inventor: Semih Esenlik , Zhaobin Zhang , Yaojun Wu , Yue Li , Kai Zhang , Li Zhang
IPC: H04N19/625 , H04N19/124 , H04N19/63
Abstract: Embodiments of the present disclosure provide a solution for visual data processing. A method for visual data processing is proposed. The method comprises: obtaining, for a conversion between visual data and a bitstream of the visual data, region information indicating positions and sizes of a plurality of regions in a quantized latent representation of the visual data; selecting, based on the region information, a set of target neighboring samples from a plurality of candidate neighboring samples of a current sample in the quantized latent representation, the set of target neighboring samples being in the same region as the current sample; determining statistical information of the current sample based on the set of target neighboring samples; and performing the conversion based on the statistical information.
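The neighbour selection in this abstract can be sketched as a filter over candidate positions: a candidate is kept only if it lies in the same region as the current sample. The `(x, y, width, height)` region tuples, the four causal neighbour offsets, and the use of a plain mean as the statistical information are assumptions chosen for illustration; both function names are hypothetical.

```python
def region_of(x, y, regions):
    """Return the index of the region containing (x, y), or None."""
    for index, (rx, ry, rw, rh) in enumerate(regions):
        if rx <= x < rx + rw and ry <= y < ry + rh:
            return index
    return None


def context_mean(latent, x, y, regions,
                 offsets=((-1, 0), (0, -1), (-1, -1), (1, -1))):
    """Mean of the candidate neighbours that share the current sample's region."""
    current_region = region_of(x, y, regions)
    if current_region is None:
        raise ValueError("current sample is not covered by any region")
    selected = []
    for dx, dy in offsets:
        nx, ny = x + dx, y + dy
        # Positions outside every region map to None and are never selected.
        if region_of(nx, ny, regions) == current_region:
            selected.append(latent[ny][nx])
    return sum(selected) / len(selected) if selected else 0.0


latent = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
regions = [(0, 0, 2, 2), (2, 0, 2, 2)]   # assumed to lie inside the latent
stat = context_mean(latent, 2, 1, regions)
```

For the sample at `(2, 1)`, the left and above-left candidates fall in the other region and are skipped, so only the above and above-right samples contribute. Restricting the context this way means each region can be entropy coded without reading samples from another region.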