Patent search ap:("NVIDIA Corporation") AND inv:"Sihyun Yu" Page 1

1.

Invention application
SYSTEM AND METHOD FOR EFFICIENT TEXT-GUIDED GENERATION OF HIGH-RESOLUTION VIDEOS [Status: In force]

Publication (announcement) No.: US20250111552A1

Publication (announcement) date: 2025-04-03

Application No.: US18819064

Filing date: 2024-08-29

Applicant: NVIDIA Corporation

Inventor: Sihyun Yu, Weili Nie, De-An Huang, Boyi Li, Animashree Anandkumar

IPC: G06T11/00, G06N3/0455, G06T9/00

Abstract: Systems and methods are disclosed that train a content frame-motion latent diffusion model (CMD) and use the CMD to generate requested videos. The CMD may be a two-stage framework that first compresses videos into a succinct latent space and then learns the video distribution in this latent space. For instance, the CMD may include an autoencoder and two diffusion models. In the first stage, the autoencoder is used to learn a low-dimensional latent decomposition of a video into a content frame and a latent motion representation. In the second stage, without adding any new parameters, the content frame distribution may be learned by fine-tuning a pretrained image diffusion model, which allows the CMD to leverage the rich visual knowledge in pretrained image diffusion models. In addition, a new lightweight diffusion model may be used to generate motion latent representations that are conditioned on the given content frame.
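
The data flow described in the abstract can be sketched in a few lines of Python. This is only an illustration of the ordering (content frame first, then a motion latent conditioned on it, then decoding), not the method claimed in the application. Every function name, the mean-based decomposition, and the random-number "samplers" standing in for the two diffusion models are assumptions made for this sketch.

```python
import random
from dataclasses import dataclass

# All names, shapes, and stub behaviors here are illustrative assumptions,
# not details taken from the patent application.


@dataclass
class Latents:
    content_frame: list  # image-like latent, flattened to a list of floats
    motion: list         # low-dimensional motion latent, one value per frame


def encode(video):
    """Stage 1 stand-in: split a video into a content frame and a motion latent.

    Assumption: the content frame is the per-pixel mean over frames and the
    motion latent is each frame's mean deviation from it. A real autoencoder
    would learn both parts.
    """
    n_frames, n_pixels = len(video), len(video[0])
    content = [sum(f[i] for f in video) / n_frames for i in range(n_pixels)]
    motion = [sum(f[i] - content[i] for i in range(n_pixels)) / n_pixels
              for f in video]
    return Latents(content, motion)


def decode(latents):
    """Decoder stand-in: shift the content frame by each motion value."""
    return [[p + m for p in latents.content_frame] for m in latents.motion]


def sample_content_frame(prompt, n_pixels, rng):
    """Stage 2a stand-in for the fine-tuned pretrained image diffusion model."""
    return [rng.random() for _ in range(n_pixels)]


def sample_motion(content_frame, prompt, n_frames, rng):
    """Stage 2b stand-in for the lightweight motion diffusion model.

    The sample depends on the content frame, mirroring the conditioning
    described in the abstract.
    """
    bias = sum(content_frame) / len(content_frame)
    return [bias + rng.gauss(0.0, 0.1) for _ in range(n_frames)]


def generate_video(prompt, n_pixels=16, n_frames=4, seed=0):
    """Sample a content frame, then a motion latent given it, then decode."""
    rng = random.Random(seed)
    content = sample_content_frame(prompt, n_pixels, rng)
    motion = sample_motion(content, prompt, n_frames, rng)
    return decode(Latents(content, motion))
```

Calling `generate_video("a cat")` returns a list of `n_frames` frames, each with `n_pixels` values. The `prompt` argument is accepted but ignored by these stubs; in a real system it would condition both samplers.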
