-
Publication No.: US20250168369A1
Publication Date: 2025-05-22
Application No.: US19033120
Application Date: 2025-01-21
Applicant: Bytedance Inc.
Inventor: Semih Esenlik , Zhaobin Zhang , Kai Zhang , Li Zhang
IPC: H04N19/192 , H04N19/124 , H04N19/159 , H04N19/184
Abstract: A mechanism for processing video data in a neural network is disclosed. The mechanism includes obtaining quantized residual latent samples. The quantized residual latent samples are processed to obtain processed quantized residual latent samples. A reconstructed latent sample can then be acquired based on the processed quantized residual latent samples.
-
Publication No.: US20250159258A1
Publication Date: 2025-05-15
Application No.: US19022954
Application Date: 2025-01-15
Applicant: Douyin Vision (Beijing) Co., Ltd. , Bytedance Inc.
Inventor: Semih Esenlik , Zhaobin Zhang , Yaojun Wu , Yue Li , Kai Zhang , Li Zhang
IPC: H04N19/625 , H04N19/124 , H04N19/63
Abstract: Embodiments of the present disclosure provide a solution for visual data processing. A method for visual data processing is proposed. The method comprises: obtaining, for a conversion between visual data and a bitstream of the visual data, region information indicating positions and sizes of a plurality of regions in a quantized latent representation of the visual data; selecting, based on the region information, a set of target neighboring samples from a plurality of candidate neighboring samples of a current sample in the quantized latent representation, the set of target neighboring samples being in the same region as the current sample; determining statistical information of the current sample based on the set of target neighboring samples; and performing the conversion based on the statistical information.
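Illustrative note: the region-constrained neighbor selection described in this abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the method claimed in the application. The `(x, y, width, height)` region layout, the causal offset list, the helper names, and the use of a plain mean as the "statistical information" are all hypothetical; a real codec would more likely feed the selected neighbors to a learned context model.

```python
from typing import List, Optional, Tuple

# Hypothetical region layout: (x, y, width, height) in latent-sample units.
Region = Tuple[int, int, int, int]
Position = Tuple[int, int]  # (x, y)


def region_index(x: int, y: int, regions: List[Region]) -> Optional[int]:
    """Return the index of the region containing (x, y), or None if outside all regions."""
    for i, (rx, ry, rw, rh) in enumerate(regions):
        if rx <= x < rx + rw and ry <= y < ry + rh:
            return i
    return None


def select_target_neighbors(
    pos: Position, offsets: List[Position], regions: List[Region]
) -> List[Position]:
    """Keep only candidate neighbors that lie in the same region as the current sample."""
    current = region_index(pos[0], pos[1], regions)
    if current is None:
        return []
    targets = []
    for dx, dy in offsets:
        nx, ny = pos[0] + dx, pos[1] + dy
        if region_index(nx, ny, regions) == current:
            targets.append((nx, ny))
    return targets


def neighbor_mean(latent: List[List[int]], neighbors: List[Position]) -> float:
    """Stand-in statistic (assumption): mean of the selected neighbors, 0.0 if none."""
    if not neighbors:
        return 0.0
    return sum(latent[y][x] for x, y in neighbors) / len(neighbors)


# A 4x4 quantized latent split into a left and a right region, each 2 samples wide.
latent = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
regions = [(0, 0, 2, 4), (2, 0, 2, 4)]
# Hypothetical causal candidates: left, above, above-left, above-right.
offsets = [(-1, 0), (0, -1), (-1, -1), (1, -1)]

targets = select_target_neighbors((2, 1), offsets, regions)
stat = neighbor_mean(latent, targets)
```

For the sample at (2, 1), the left and above-left candidates fall in the other region and are dropped, so only the two candidates in the right-hand region contribute to the statistic. Because no sample depends on data outside its own region, regions can in principle be entropy-coded independently.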
-
Publication No.: US20250030857A1
Publication Date: 2025-01-23
Application No.: US18828890
Application Date: 2024-09-09
Inventor: Yaojun Wu , Semih Esenlik , Zhaobin Zhang , Yue Li , Kai Zhang , Li Zhang
IPC: H04N19/126 , H04N19/42
Abstract: Embodiments of the present disclosure provide a solution for visual data processing. A method for visual data processing is proposed. The method comprises: performing, for a conversion between visual data and a bitstream of the visual data, a quantization process on a dataset comprising at least one of: input visual data of a neural network model used for the conversion, or a parameter of the neural network model; and performing the conversion based on the quantization process.
-
Publication No.: US20250168370A1
Publication Date: 2025-05-22
Application No.: US19033178
Application Date: 2025-01-21
Applicant: Bytedance Inc.
Inventor: Semih Esenlik , Zhaobin Zhang , Kai Zhang , Li Zhang
IPC: H04N19/192 , H04N19/124 , H04N19/70
Abstract: An image decoding method including transforming an input image into latent samples using an analysis transform; quantizing the latent samples using a hyper encoder to generate quantized hyper latent samples; encoding the quantized hyper latent samples into a bitstream using entropy encoding; applying a latent sample prediction process to obtain quantized latent samples and quantized residual latent samples based on the latent samples using the quantized hyper latent samples; obtaining prediction samples following the latent sample prediction process; and entropy encoding the quantized hyper latent samples and the quantized residual latent samples into the bitstream.
-
Publication No.: US20250159214A1
Publication Date: 2025-05-15
Application No.: US19021539
Application Date: 2025-01-15
Applicant: Bytedance Inc.
Inventor: Zhaobin Zhang , Semih Esenlik , Kai Zhang , Li Zhang
IPC: H04N19/189 , H04N19/119 , H04N19/176
Abstract: An image decoding method including obtaining reconstructed latents ŷ[:,:,:] using an arithmetic decoder; feeding the reconstructed latents into a synthesis neural network; tile partitioning output feature maps into multiple parts based on decoded parameters at one or multiple locations; separately feeding each of the multiple parts into a next stage of a plurality of convolutional layers to obtain spatially partitioned feature maps at an output; and cropping and stitching the spatially partitioned feature maps back to a whole feature map spatially until an image is reconstructed.
-
Publication No.: US20250119552A1
Publication Date: 2025-04-10
Application No.: US18982647
Application Date: 2024-12-16
Applicant: Douyin Vision (Beijing) Co., Ltd. , Bytedance Inc.
Inventor: Zhaobin Zhang , Semih Esenlik , Yaojun Wu , Yue Li , Kai Zhang , Li Zhang
IPC: H04N19/149 , H04N19/122 , H04N19/186
Abstract: A mechanism for processing video data is disclosed. A determination is made to resize an image with a first size to create a resized image with a second size. A conversion is performed between visual media data and a bitstream based on the resized image. The conversion includes applying a neural network-based coding model to the resized image to achieve variable rate neural network-based compression.
-
Publication No.: US20240414381A1
Publication Date: 2024-12-12
Application No.: US18807764
Application Date: 2024-08-16
Inventor: Yaojun Wu , Yue Li , Zhaobin Zhang , Semih Esenlik , Kai Zhang , Li Zhang
Abstract: Embodiments of the present disclosure provide a solution for data processing. A method for data processing is proposed. The method comprises: determining, by using a first model with an attention mechanism during a conversion between data and a bitstream of the data, a probability distribution for entropy coding associated with the bitstream; and performing the conversion based on the probability distribution.
-