Patent search ap:("Lemon Inc.") AND inv:"Wei Min Wang" Page 1

1.

Invention publication
VIDEO GENERATION WITH LATENT DIFFUSION MODELS Under examination (published)

Publication (announcement) number: US20240169479A1

Publication (announcement) date: 2024-05-23

Application number: US18056444

Application date: 2022-11-17

Applicant: Lemon Inc.

Inventor: Wei Min Wang, Daquan Zhou, Jiashi Feng

IPC: G06T3/40

CPC classification number: G06T3/4007, G06T3/4053

Abstract: The present disclosure provides systems and methods for video generation using latent diffusion machine learning models. Given a text input, video data relevant to the text input can be generated using a latent diffusion model. The process includes generating a predetermined number of key frames using text-to-image generation tasks performed within a latent space via a variational auto-encoder, enabling faster training and sampling times compared to pixel space-based diffusion models. The process further includes utilizing two-dimensional convolutions and associated adaptors to learn features for a given frame. Temporal information for the frames can be learned via a directed temporal attention module used to capture the relation among frames and to generate a temporally meaningful sequence of frames. Additional frames can be generated via a frame interpolation process for inserting one or more transition frames between two generated frames. The process can also include a super-resolution process for upsampling the frames.
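The stages named in the abstract can be illustrated with a toy sketch. This is not the disclosed implementation: the abstract does not specify any of the operations below, and every function name here is hypothetical. The sketch assumes that "directed" temporal attention means each frame attends only to itself and earlier frames (a causal mask), and it substitutes linear blending for the frame interpolation process and nearest-neighbor repetition for the super-resolution process. Random arrays stand in for the key frames that a latent diffusion model would generate.

```python
import numpy as np


def directed_temporal_attention(features: np.ndarray) -> np.ndarray:
    """Mix per-frame feature vectors of shape (T, D) along the time axis.

    Assumption: "directed" is modeled as a causal mask, so frame i
    attends only to frames 0..i. No learned projections are used.
    """
    t, d = features.shape
    scores = features @ features.T / np.sqrt(d)       # (T, T) similarities
    mask = np.tril(np.ones((t, t), dtype=bool))       # lower triangle = past and present
    scores = np.where(mask, scores, -np.inf)          # block attention to future frames
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ features


def interpolate_frames(frames: np.ndarray, n_between: int = 1) -> np.ndarray:
    """Insert n_between linearly blended frames between each adjacent pair.

    Stand-in for the frame interpolation process; frames has shape (T, H, W).
    """
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        out.append(a)
        for k in range(1, n_between + 1):
            alpha = k / (n_between + 1)
            out.append((1 - alpha) * a + alpha * b)
    out.append(frames[-1])
    return np.stack(out)


def upsample_frames(frames: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbor upsampling, a stand-in for the super-resolution process."""
    return frames.repeat(factor, axis=1).repeat(factor, axis=2)


# Placeholder key frames: 4 frames of 8x8 values instead of diffusion outputs.
rng = np.random.default_rng(0)
key_frames = rng.standard_normal((4, 8, 8))

mixed = directed_temporal_attention(key_frames.reshape(4, -1)).reshape(4, 8, 8)
video = upsample_frames(interpolate_frames(mixed, n_between=1), factor=2)
print(video.shape)  # (7, 16, 16): 4 key frames + 3 transitions, each 16x16
```

The ordering (key frames, temporal mixing, interpolation, upsampling) follows the abstract; in the described system each stage would be a learned model operating in the latent space rather than the fixed arithmetic used here.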
