METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING AN ATTENTION BLOCK FOR NEURAL NETWORK-BASED IMAGE AND VIDEO COMPRESSION

    Publication (Announcement) Number: US20240289590A1

    Publication (Announcement) Date: 2024-08-29

    Application Number: US18572100

    Application Date: 2022-06-16

    CPC classification number: G06N3/045

    Abstract: Various embodiments provide a method, an apparatus, and a computer program product. The method comprises: defining an attention block comprising: a set of initial neural network layers, wherein each layer processes an output of a previous layer, and wherein a first layer processes an input of the dense split attention block; core attention blocks that process one or more outputs of the set of initial neural network layers; a concatenation block for concatenating one or more outputs of the core attention blocks and at least one intermediate output of the set of initial neural network layers; one or more final neural network layers that process at least the output of the concatenation block; and a summation block that sums an output of the final neural network layers and an input to the attention block; and providing an output of the summation block as a final output of the attention block.
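The abstract describes the block only structurally, so the following minimal PyTorch sketch traces that topology as a reading aid. It is not the patented design: the layer counts, channel width, the use of plain convolutions for the initial and final layers, and the placeholder SimpleCoreAttention module are all assumptions; only the data flow (sequential initial layers, core attention on their outputs, concatenation with an intermediate output, final layers, and a residual summation with the block input) follows the abstract.

# Illustrative sketch of the attention-block topology in the abstract.
# Layer types, counts, and channel widths are assumptions, not the patented design.
import torch
import torch.nn as nn


class SimpleCoreAttention(nn.Module):
    """Placeholder core attention: channel-wise gating (an assumption)."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.gate(x)


class DenseSplitAttentionBlock(nn.Module):
    def __init__(self, channels=64, num_initial=3):
        super().__init__()
        # Initial layers: each processes the output of the previous layer;
        # the first processes the input of the block.
        self.initial = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(num_initial)
        ])
        # One core attention block per initial-layer output (an assumption).
        self.core = nn.ModuleList([SimpleCoreAttention(channels) for _ in range(num_initial)])
        # Final layer processes the concatenation of the core-attention outputs
        # and one intermediate output of the initial layers.
        self.final = nn.Conv2d(channels * (num_initial + 1), channels, 1)

    def forward(self, x):
        outs, h = [], x
        for layer in self.initial:
            h = layer(h)            # each initial layer consumes the previous output
            outs.append(h)
        attended = [core(o) for core, o in zip(self.core, outs)]
        # Concatenation block: core-attention outputs plus an intermediate output.
        cat = torch.cat(attended + [outs[0]], dim=1)
        y = self.final(cat)         # final neural network layer(s)
        return y + x                # summation block: add the block input


# Usage on a dummy 64-channel feature map.
block = DenseSplitAttentionBlock(channels=64)
out = block(torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])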

    TRANSFORMER BASED VIDEO CODING
    Invention Publication

    Publication (Announcement) Number: US20240267543A1

    Publication (Announcement) Date: 2024-08-08

    Application Number: US18425693

    Application Date: 2024-01-29

    CPC classification number: H04N19/30 H04N19/172 H04N19/88

    Abstract: An example method includes: receiving a target frame and one or more reference frames; extracting a first feature map from a first predicted target frame predicted from a first reference frame, and a second feature map from a second predicted target frame predicted from a second reference frame, wherein the first predicted target frame is a backward predicted target frame and the second predicted target frame is a forward predicted target frame; generating a refined residual feature based at least on the first feature map, the second feature map, and a third feature map extracted from a feature decoder net module or circuit; generating a frame residual based at least on the refined residual feature; and generating an output reconstructed frame based at least on the frame residual and an average frame, wherein the average frame represents an average of the first predicted target frame and the second predicted target frame.
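As a reading aid, the sketch below traces the reconstruction path the abstract describes. It is a minimal PyTorch sketch under stated assumptions: FeatureExtractor, ResidualRefiner, and ResidualDecoder are hypothetical placeholder modules named here for illustration, not components defined by the patent; only the flow (two predicted target frames, their feature maps, fusion with a third decoder-side feature map into a refined residual feature, decoding to a frame residual, and adding it to the average of the two predictions) follows the abstract.

# Illustrative data flow for the reconstruction path described in the abstract.
# All module names and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn


class FeatureExtractor(nn.Module):
    """Placeholder: maps an RGB frame to a feature map."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())

    def forward(self, frame):
        return self.net(frame)


class ResidualRefiner(nn.Module):
    """Placeholder fusion of the backward, forward, and decoder-side feature maps."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3 * channels, channels, 3, padding=1), nn.ReLU())

    def forward(self, f_bwd, f_fwd, f_dec):
        return self.net(torch.cat([f_bwd, f_fwd, f_dec], dim=1))


class ResidualDecoder(nn.Module):
    """Placeholder: maps the refined residual feature to a frame residual."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, feat):
        return self.net(feat)


def reconstruct(pred_bwd, pred_fwd, decoder_feat, extract, refine, decode):
    """pred_bwd / pred_fwd: backward- and forward-predicted target frames."""
    f_bwd = extract(pred_bwd)                     # first feature map
    f_fwd = extract(pred_fwd)                     # second feature map
    refined = refine(f_bwd, f_fwd, decoder_feat)  # refined residual feature
    residual = decode(refined)                    # frame residual
    average = 0.5 * (pred_bwd + pred_fwd)         # average of the two predictions
    return average + residual                     # output reconstructed frame


# Dummy run with random frames and a random decoder-side feature map.
extract, refine, decode = FeatureExtractor(), ResidualRefiner(), ResidualDecoder()
pred_bwd = torch.randn(1, 3, 64, 64)
pred_fwd = torch.randn(1, 3, 64, 64)
decoder_feat = torch.randn(1, 32, 64, 64)
print(reconstruct(pred_bwd, pred_fwd, decoder_feat, extract, refine, decode).shape)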
