Conditional axial transformer layers for high-fidelity image transformation

    公开(公告)号:US12182965B2

    公开(公告)日:2024-12-31

    申请号:US17449162

    申请日:2021-09-28

    Applicant: Google LLC

    Abstract: Apparatus and methods relate to receiving an input image comprising an array of pixels, wherein the input image is associated with a first characteristic; applying a neural network to transform the input image to an output image associated with a second characteristic by generating, by an encoder and for each pixel of the array of pixels of the input image, an encoded pixel, providing, to a decoder, the array of encoded pixels, applying, by the decoder, axial attention to decode a given pixel, wherein the axial attention comprises a row attention or a column attention applied to one or more previously decoded pixels in rows or columns preceding a row or column associated with the given pixel, wherein the row or column attention mixes information within a respective row or column, and maintains independence between respective different rows or different columns; and generating, by the neural network, the output image.

    AUTO-REGRESSIVE VIDEO GENERATION NEURAL NETWORKS

    公开(公告)号:US20250061608A1

    公开(公告)日:2025-02-20

    申请号:US18938770

    申请日:2024-11-06

    Applicant: Google LLC

    Abstract: A method for generating a video is described. The method includes: generating an initial output video including multiple frames, each of the frames having multiple channels; identifying a partitioning of the initial output video into a set of channel slices that are indexed according to a particular slice order, each channel slice being a down sampling of a channel stack from a set of channel stacks; initializing, for each channel stack in the set of channel stacks, a set of fully-generated channel slices; repeatedly processing, using an encoder and a decoder, a current output video to generate a next fully-generated channel slice to be added to the current set of fully-generated channel slices; generating, for each channel index, a respective fully-generated channel stack using the respective fully generated channel slices; and generating a fully-generated output video using the fully-generated channel stacks.

    Auto-regressive video generation neural networks

    公开(公告)号:US12142015B2

    公开(公告)日:2024-11-12

    申请号:US17609668

    申请日:2020-05-22

    Applicant: Google LLC

    Abstract: A method for generating a video is described. The method includes: generating an initial output video including multiple frames, each of the frames having multiple channels; identifying a partitioning of the initial output video into a set of channel slices that are indexed according to a particular slice order, each channel slice being a down sampling of a channel stack from a set of channel stacks; initializing, for each channel stack in the set of channel stacks, a set of fully-generated channel slices; repeatedly processing, using an encoder and a decoder, a current output video to generate a next fully-generated channel slice to be added to the current set of fully-generated channel slices; generating, for each channel index, a respective fully-generated channel stack using the respective fully generated channel slices; and generating a fully-generated output video using the fully-generated channel stacks.

Patent Agency Ranking