-
Publication number: US20250166135A1
Publication date: 2025-05-22
Application number: US18951203
Filing date: 2024-11-18
Applicant: Google LLC
Inventor: Yu-Chuan Su, Hsin-Ping Huang, Ming-Hsuan Yang, Deqing Sun, Lu Jiang, Yukun Zhu, Xuhui Jia
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controllable video generation. One of the methods includes receiving a text prompt that specifies an object; receiving a control input that comprises an image that depicts a particular instance of the object; generating a video that comprises a respective video frame at each of a plurality of time steps in the video and that depicts the particular instance of the object. Generating the video includes, at each of the plurality of time steps: obtaining a text prompt embedding; obtaining a control input embedding; and generating the respective video frame at the time step using a video generation neural network while the video generation neural network is conditioned on the text prompt embedding and on the control input embedding.
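The per-time-step loop described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `embed_text`, `embed_image`, and `toy_frame_generator` are hypothetical stand-ins for the text encoder, the control-input encoder, and the video generation neural network, none of which the abstract specifies.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Frame = List[float]


def embed_text(prompt: str, dim: int = 4) -> Vector:
    """Hypothetical text encoder: folds character codes into a fixed-size vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(prompt):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def embed_image(pixels: Sequence[float], dim: int = 4) -> Vector:
    """Hypothetical control-input encoder: folds pixel values into a fixed-size vector."""
    vec = [0.0] * dim
    for i, p in enumerate(pixels):
        vec[i % dim] += p
    return vec


def toy_frame_generator(t: int, text_emb: Vector, control_emb: Vector) -> Frame:
    """Hypothetical stand-in for the video generation network.

    The output depends on the time step and on both conditioning embeddings.
    """
    return [t + a + b for a, b in zip(text_emb, control_emb)]


def generate_video(
    prompt: str,
    control_image: Sequence[float],
    num_steps: int,
    generator: Callable[[int, Vector, Vector], Frame] = toy_frame_generator,
) -> List[Frame]:
    """Produce one frame per time step, conditioned on text and control input."""
    frames: List[Frame] = []
    for t in range(num_steps):
        # The abstract lists both embeddings as obtained at each time step.
        text_emb = embed_text(prompt)
        control_emb = embed_image(control_image)
        frames.append(generator(t, text_emb, control_emb))
    return frames


video = generate_video("a dog running", [0.1, 0.2, 0.3, 0.4], num_steps=3)
```

Because the embeddings here do not change between time steps, a real system could compute them once and reuse them; the sketch keeps them inside the loop only to mirror the wording of the abstract.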
-
Publication number: US20230222628A1
Publication date: 2023-07-13
Application number: US17572923
Filing date: 2022-01-11
Applicant: Google LLC
Inventor: Yang Zhao, Yu-Chuan Su, Chun-Te Chu, Yandong Li, Marius Renn, Yukun Zhu, Xuhui Jia, Bradley Ray Green
CPC classification number: G06T5/001, G06V40/168, G06T2207/30201, G06T2207/20084, G06T2207/20081
Abstract: Systems and methods for training a restoration model can leverage training for two sub-tasks to train the restoration model to generate realistic and identity-preserved outputs. The systems and methods can balance the training of the generation task and the reconstruction task to ensure the generated outputs preserve the identity of the original subject while generating realistic outputs. The systems and methods can further leverage a feature quantization model and skip connections to improve the model output and overall training.
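The abstract does not say how the generation and reconstruction tasks are balanced or how features are quantized. As a rough illustration only, the sketch below assumes a weighted sum of the two losses and a nearest-neighbor codebook lookup, both common techniques rather than the method claimed; the function names and default weights are hypothetical.

```python
from typing import List, Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error, used here as a simple reconstruction loss."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    reconstruction_loss: float,
    generation_loss: float,
    reconstruction_weight: float = 1.0,  # assumed value, not from the source
    generation_weight: float = 0.1,      # assumed value, not from the source
) -> float:
    """Balance the two sub-task losses with a weighted sum (an assumption)."""
    return (
        reconstruction_weight * reconstruction_loss
        + generation_weight * generation_loss
    )


def quantize(feature: Sequence[float], codebook: Sequence[Sequence[float]]) -> List[float]:
    """Replace a feature vector with its nearest codebook entry (squared L2 distance)."""

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return list(min(codebook, key=lambda entry: sq_dist(feature, entry)))


rec = l1_loss([1.0, 2.0], [1.0, 4.0])
total = combined_loss(rec, generation_loss=2.0, generation_weight=0.5)
nearest = quantize([0.9, 0.1], [[0.0, 0.0], [1.0, 0.0]])
```

Raising `reconstruction_weight` pushes the model toward preserving the identity of the input, while raising `generation_weight` favors realism; the right trade-off would have to be found empirically.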
-