VIDEO GENERATION WITH LATENT DIFFUSION MODELS

    Publication number: US20240169479A1

    Publication date: 2024-05-23

    Application number: US18056444

    Application date: 2022-11-17

    Applicant: Lemon Inc.

    CPC classification number: G06T3/4007 G06T3/4053

    Abstract: The present disclosure provides systems and methods for video generation using latent diffusion machine learning models. Given a text input, video data relevant to the text input can be generated using a latent diffusion model. The process includes generating a predetermined number of key frames using text-to-image generation tasks performed within a latent space via a variational auto-encoder, enabling faster training and sampling times compared to pixel space-based diffusion models. The process further includes utilizing two-dimensional convolutions and associated adaptors to learn features for a given frame. Temporal information for the frames can be learned via a directed temporal attention module used to capture the relation among frames and to generate a temporally meaningful sequence of frames. Additional frames can be generated via a frame interpolation process for inserting one or more transition frames between two generated frames. The process can also include a super-resolution process for upsampling the frames.
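The directed temporal attention step in the abstract can be illustrated with a small sketch. It assumes that "directed" means each frame attends only to itself and to earlier frames (a causal mask); the abstract does not spell this out. The function name `directed_temporal_attention` is hypothetical, and the sketch uses the frame features directly as queries, keys, and values, without the learned projections a trained model would have.

```python
import math


def directed_temporal_attention(frames):
    """Mix per-frame feature vectors along the time axis.

    frames: list of T feature vectors (lists of floats), one per key frame.
    Returns T vectors; output t is a softmax-weighted average of frames 0..t.
    Assumption: "directed" is modelled as a causal mask over frame indices.
    """
    dim = len(frames[0])
    scale = 1.0 / math.sqrt(dim)  # standard scaled dot-product factor
    mixed_frames = []
    for t, query in enumerate(frames):
        # Causal mask: frame t can only see frames at index <= t.
        visible = frames[: t + 1]
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in visible]
        # Numerically stable softmax over the visible frames.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted average of the visible frame features.
        mixed = [sum(w * v[i] for w, v in zip(weights, visible)) for i in range(dim)]
        mixed_frames.append(mixed)
    return mixed_frames


if __name__ == "__main__":
    key_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for t, vec in enumerate(directed_temporal_attention(key_frames)):
        print(t, vec)
```

Because of the mask, changing a later frame leaves the outputs for earlier frames untouched, which is one way to obtain a temporally ordered sequence.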

    AUTOMATICALLY AND EFFICIENTLY GENERATING SEARCH SPACES FOR NEURAL NETWORK

    Publication number: US20220398450A1

    Publication date: 2022-12-15

    Application number: US17348246

    Application date: 2021-06-15

    Applicant: Lemon Inc.

    Abstract: A super-network comprising a plurality of layers may be generated. Each layer may comprise cells with different structures. A predetermined number of cells from each layer may be selected. A plurality of cells may be generated based on selected cells using a local mutation model, wherein the local mutation model comprises a mutation window for removing redundant edges from each selected cell. Performance of the plurality of cells may be evaluated using a differentiable fitness scoring function. The operations of the generating a plurality of cells using the local mutation model, the evaluating performance of the plurality of cells using the differentiable fitness scoring function and the selecting the subset of cells based on the evaluation results may be iteratively performed until the super-network converges. A search space for each layer may be generated based on a predetermined top number of cells with largest fitness scores after the super-network converges.
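The select, mutate, evaluate loop described in the abstract can be sketched for a single layer as follows. Everything here is illustrative: a cell is modelled as a set of edges, `toy_fitness` is a placeholder and not the differentiable fitness scoring function of the patent, the mutation window is taken to be a run of consecutive edges in sorted order, and a fixed number of rounds stands in for the convergence check. All function and parameter names are hypothetical.

```python
import random


def mutate(cell, window, rng):
    """Local mutation: drop one edge chosen inside a window of consecutive edges."""
    edges = sorted(cell)
    if len(edges) <= 1:
        return cell  # nothing left to prune
    start = rng.randrange(max(1, len(edges) - window + 1))
    victim = rng.choice(edges[start:start + window])
    return frozenset(e for e in edges if e != victim)


def toy_fitness(cell, useful):
    """Placeholder score: reward useful edges, penalise every edge kept."""
    return 2 * len(cell & useful) - len(cell)


def search(population, useful, top_k=2, window=2, rounds=20, seed=0):
    """Iterate select -> mutate -> evaluate and return the top_k cells."""
    rng = random.Random(seed)

    def score(cell):
        return toy_fitness(cell, useful)

    for _ in range(rounds):
        # Select a predetermined number of the best cells.
        parents = sorted(population, key=score, reverse=True)[:top_k]
        # Generate new cells from the selected ones via local mutation.
        children = [mutate(p, window, rng) for p in parents for _ in range(3)]
        # Keep the parents so the best score never decreases.
        population = list(set(parents) | set(children))
    return sorted(population, key=score, reverse=True)[:top_k]


if __name__ == "__main__":
    full_cell = frozenset({(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)})
    useful_edges = frozenset({(0, 1), (1, 3)})
    for cell in search([full_cell], useful_edges):
        print(sorted(cell), toy_fitness(cell, useful_edges))
```

The cells returned at the end play the role of the per-layer search space built from the top-scoring cells.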
