TRANSFORMER-BASED ARCHITECTURE FOR TRANSFORM CODING OF MEDIA

    Publication No.: WO2023049655A1

    Publication date: 2023-03-30

    Application No.: PCT/US2022/076496

    Filing date: 2022-09-15

    Abstract: Systems and techniques are described herein for processing media data using a neural network system. For instance, a process can include obtaining a latent representation of a frame of encoded image data and generating, by a plurality of decoder transformer layers of a decoder sub-network using the latent representation of the frame of encoded image data as input, a frame of decoded image data. At least one decoder transformer layer of the plurality of decoder transformer layers includes: one or more transformer blocks for generating one or more patches of features and determining self-attention locally within one or more window partitions and shifted window partitions applied over the one or more patches; and a patch un-merging engine for decreasing a respective size of each patch of the one or more patches.
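
The two structural pieces named in this abstract, window partitioning (regular and shifted) and patch un-merging, can be illustrated with a minimal pure-Python sketch. It is not the applicant's implementation. The function names, the cyclic shift, and the reading of "patch un-merging" as a depth-to-space split of each feature vector are assumptions, and the attention computation itself is omitted.

```python
def patch_unmerge(grid, factor=2):
    """Split every feature vector into factor*factor smaller patches.

    grid is an H x W list of feature vectors (lists of length C). The result
    is (H*factor) x (W*factor) with vectors of length C // factor**2.
    Assumption: un-merging is the inverse of Swin-style patch merging.
    """
    h, w = len(grid), len(grid[0])
    c_out = len(grid[0][0]) // (factor * factor)
    out = [[None] * (w * factor) for _ in range(h * factor)]
    for i in range(h):
        for j in range(w):
            vec = grid[i][j]
            for di in range(factor):
                for dj in range(factor):
                    k = (di * factor + dj) * c_out
                    out[i * factor + di][j * factor + dj] = vec[k:k + c_out]
    return out


def window_partition(h, w, window, shift=0):
    """Group patch coordinates of an h x w grid into window x window blocks.

    A non-zero shift rolls the grid cyclically before partitioning, which is
    one common way to realise shifted windows (an assumption here).
    Self-attention would then be computed only among patches of one window.
    """
    windows = {}
    for i in range(h):
        for j in range(w):
            key = (((i + shift) % h) // window, ((j + shift) % w) // window)
            windows.setdefault(key, []).append((i, j))
    return list(windows.values())
```

For a 4 x 4 grid of patches, `window_partition(4, 4, 2)` yields four windows of four patches each, and `window_partition(4, 4, 2, shift=1)` regroups the same patches so that neighbours across the original window borders share a window.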

    ENCODING DEVICE, DECODING DEVICE, ENCODING METHOD, AND DECODING METHOD

    Publication No.: WO2023026645A1

    Publication date: 2023-03-02

    Application No.: PCT/JP2022/023843

    Filing date: 2022-06-14

    IPC classification: H04N19/59

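    Abstract: An encoding device (100) includes circuitry and memory. The circuitry registers one or more of a plurality of first reference picture candidates in a first reference picture list, selects a first reference picture from the first reference picture list, and encodes a current block using a first reference block in the first reference picture and reference picture resampling (RPR). In RPR, the first reference block is resampled when the picture size of the first reference picture differs from the picture size of the current picture. For each of the plurality of first reference picture candidates, the circuitry registers that candidate in the first reference picture list when the candidate has a first picture size.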
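
The list-construction rule in this abstract can be sketched in a few lines of Python. This is an illustration only: the `Picture` type, the function names, and passing the "first picture size" in as a parameter `allowed_size` are assumptions, since the abstract does not say how that size is chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    poc: int      # picture order count, used here only as an identifier
    width: int
    height: int


def build_ref_list(candidates, allowed_size):
    """Register only the candidates whose size equals allowed_size.

    allowed_size stands in for the abstract's "first picture size"
    (hypothetical parameter).
    """
    return [p for p in candidates if (p.width, p.height) == allowed_size]


def needs_resampling(ref, cur):
    """Under RPR the reference block is resampled when the sizes differ."""
    return (ref.width, ref.height) != (cur.width, cur.height)
```

With candidates of sizes 1920x1080, 960x540 and 1920x1080 and `allowed_size=(1920, 1080)`, only the first and third candidates are registered. Whether a registered picture then needs resampling depends on the size of the current picture.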

    VIDEO ENCODING DEVICE AND VIDEO DECODING DEVICE

    Publication No.: WO2022264622A1

    Publication date: 2022-12-22

    Application No.: PCT/JP2022/015302

    Filing date: 2022-03-29

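    Abstract: When inverse resolution conversion is performed by selecting, from a predetermined set of model parameters, the model parameters suited to an input video and applying them, a video that is not suited to any of the model parameters may end up with low quality. A prediction unit is provided that includes a neural network performing scaling by a rational factor and an interpolation unit performing interpolation by a rational factor. From the actual width and height of a reference image and the actual width and height of a target image, a first scaling factor for the neural network and a second scaling factor for the interpolation unit are derived, and an interpolated image is derived using the first scaling by the neural network and the second scaling by the interpolation unit.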
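
One way to picture the two-stage scaling is to split the overall ratio between target and reference size into a factor handled by the neural network and a remainder handled by the interpolation unit. The sketch below does this for a single dimension with exact rational arithmetic. The set of factors the network supports (`NN_FACTORS`) and the rule of picking the largest supported factor not exceeding the total are assumptions, not taken from the abstract.

```python
from fractions import Fraction

# Hypothetical set of rational factors the network is assumed to support.
NN_FACTORS = (Fraction(1), Fraction(3, 2), Fraction(2))


def split_scaling(ref_size, target_size, nn_factors=NN_FACTORS):
    """Return (first, second) with first * second == target_size / ref_size.

    first  : factor applied by the neural network
    second : remaining factor applied by the interpolation unit
    """
    total = Fraction(target_size, ref_size)
    usable = [f for f in nn_factors if f <= total]
    # Fall back to the smallest supported factor when none fits (downscaling).
    first = max(usable) if usable else min(nn_factors)
    second = total / first
    return first, second
```

For a reference width of 1000 and a target width of 1920, the sketch assigns 3/2 to the network and 32/25 to the interpolation unit. The same function would be applied separately to the heights.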

    MATRIX-BASED INTRA PREDICTION WITH ASYMMETRIC BINARY TREE

    Publication No.: WO2022200130A1

    Publication date: 2022-09-29

    Application No.: PCT/EP2022/056732

    Filing date: 2022-03-15

    Abstract: Several methods are described to jointly use the ABT (Asymmetric Binary Tree) partitioning mode and Matrix-based Intra Prediction (MIP). In a first embodiment, we propose to forbid the use of the MIP intra prediction mode for block sizes that result from ABT partitioning. In a second embodiment, we propose to allow MIP intra prediction for block sizes not equal to a power of two in width or height, by extending the block before MIP and cropping the predicted block to the original size after MIP. In a third embodiment, we propose to adapt the down-sampling of the boundary reference samples and the up-sampling of the reduced predicted blocks to the block sizes introduced by ABT partitioning. In a further embodiment, we set the reduced predicted block to size 8x8 whenever the initial block size is 8 or larger in a direction.
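
The second embodiment (extend, predict, crop) can be sketched as follows. The sketch assumes the block is extended to the next power of two in each dimension, which the abstract does not state explicitly, and it takes the MIP predictor as a caller-supplied function `mip_predict(width, height)` that is purely hypothetical.

```python
def next_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def predict_with_extension(width, height, mip_predict):
    """Run MIP on an extended block, then crop to the original size.

    mip_predict is a placeholder for a predictor that only accepts
    power-of-two dimensions and returns `height` rows of `width` samples.
    """
    ext_w, ext_h = next_pow2(width), next_pow2(height)
    pred = mip_predict(ext_w, ext_h)
    # Keep the top-left width x height region of the extended prediction.
    return [row[:width] for row in pred[:height]]
```

A 12x8 block, such as one produced by a 3/4 ABT split of a 16x8 block, would be predicted as 16x8 and cropped back to 12x8.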

    METHODS AND APPARATUSES FOR ENCODING/DECODING A VIDEO

    Publication No.: WO2022180031A1

    Publication date: 2022-09-01

    Application No.: PCT/EP2022/054392

    Filing date: 2022-02-22

    Abstract: A method for reconstructing at least one part of a first picture from at least one part of a second picture is provided, said first picture and said second picture having different sizes. The reconstructing comprises decoding said second picture from a bitstream and determining at least one first sample of said at least one part of the first picture using at least one resampling filter applied to at least one second sample of said at least one part of the decoded second picture. A corresponding apparatus for reconstructing at least one part of a first picture is provided. A method for encoding/decoding a video, and corresponding apparatuses, are provided which comprise reconstructing at least one part of a first picture from at least one part of a second picture, said first picture and said second picture having different sizes.
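
To make the idea of a resampling filter concrete, the sketch below resamples one row of decoded samples to a different length with a two-tap linear filter. This is a stand-in chosen for brevity. The abstract does not specify the filter, and practical codecs generally use longer filters, so the filter choice and the function name are assumptions.

```python
def resample_row(samples, out_len):
    """Resample a row of samples to out_len values by linear interpolation.

    Each output sample is a weighted sum of the two nearest input samples;
    the first and last input samples map to the first and last outputs.
    """
    in_len = len(samples)
    if in_len == 1 or out_len == 1:
        return [float(samples[0])] * out_len
    out = []
    for i in range(out_len):
        pos = i * (in_len - 1) / (out_len - 1)   # position in input coordinates
        left = int(pos)
        right = min(left + 1, in_len - 1)
        frac = pos - left
        out.append((1 - frac) * samples[left] + frac * samples[right])
    return out
```

Applying the function to every row and then to every column of the decoded second picture gives a separable two-dimensional resampling to the size of the first picture.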

    IMAGE ENCODING AND DECODING, VIDEO ENCODING AND DECODING: METHODS, SYSTEMS AND TRAINING METHODS

    Publication No.: WO2022084702A1

    Publication date: 2022-04-28

    Application No.: PCT/GB2021/052770

    Filing date: 2021-10-25

    Applicant: DEEP RENDER LTD

    Abstract: There is disclosed a computer-implemented method for lossy or lossless image or video compression and transmission, the method including the steps of: (i) receiving an input image; (ii) encoding the input image using an encoder trained neural network, to produce a y latent representation; (iii) encoding the y latent representation using a hyperencoder trained neural network, to produce a z hyperlatent representation; (iv) quantizing the z hyperlatent representation using a predetermined entropy parameter to produce a quantized z hyperlatent representation; (v) entropy encoding the quantized z hyperlatent representation into a first bitstream, using predetermined entropy parameters; (vi) processing the quantized z hyperlatent representation using a hyperdecoder trained neural network to obtain a location entropy parameter µy, an entropy scale parameter σy, and a context matrix Ay of the y latent representation; (vii) processing the y latent representation, the location entropy parameter µy and the context matrix Ay, using an implicit encoding solver, to obtain quantized latent residuals; (viii) entropy encoding the quantized latent residuals into a second bitstream, using the entropy scale parameter σy; and (ix) transmitting the first bitstream and the second bitstream. Related computer-implemented methods, systems, computer-implemented training methods and computer program products are disclosed.
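
Step (vii) can be illustrated with a small sketch under strong assumptions. The abstract does not say what form the implicit encoding solver takes. Here the context matrix is assumed to be strictly lower triangular, so that the implicit relation between the quantized latent and its residuals can be resolved element by element. The function names and this triangular structure are hypothetical, and the neural networks and entropy coding of the other steps are omitted.

```python
def encode_residuals(y, mu, A):
    """Return (xi, y_hat): integer residuals and the reconstructed latent.

    Assumed model: y_hat[i] = xi[i] + mu[i] + sum_j A[i][j] * y_hat[j],
    with A strictly lower triangular so only j < i contributes.
    """
    n = len(y)
    xi, y_hat = [0] * n, [0.0] * n
    for i in range(n):
        ctx = mu[i] + sum(A[i][j] * y_hat[j] for j in range(i))
        xi[i] = round(y[i] - ctx)   # Python rounds exact ties to the even integer
        y_hat[i] = xi[i] + ctx
    return xi, y_hat


def decode_latent(xi, mu, A):
    """Rebuild y_hat from the residuals using the same recurrence."""
    n = len(xi)
    y_hat = [0.0] * n
    for i in range(n):
        ctx = mu[i] + sum(A[i][j] * y_hat[j] for j in range(i))
        y_hat[i] = xi[i] + ctx
    return y_hat
```

Because the decoder repeats the encoder's recurrence with the same µy and Ay, it recovers the same reconstructed latent from the residuals alone, and each reconstructed element differs from the original by at most 0.5.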