Publication No.: US20250166135A1
Publication Date: 2025-05-22
Application No.: US18951203
Filing Date: 2024-11-18
Applicant: Google LLC
Inventor: Yu-Chuan Su, Hsin-Ping Huang, Ming-Hsuan Yang, Deqing Sun, Lu Jiang, Yukun Zhu, Xuhui Jia
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controllable video generation. One of the methods includes receiving a text prompt that specifies an object; receiving a control input that comprises an image that depicts a particular instance of the object; and generating a video that comprises a respective video frame at each of a plurality of time steps in the video and that depicts the particular instance of the object. Generating the video includes, at each of the plurality of time steps: obtaining a text prompt embedding; obtaining a control input embedding; and generating the respective video frame at the time step using a video generation neural network while the video generation neural network is conditioned on the text prompt embedding and on the control input embedding.
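
Illustrative sketch (not part of the filing): the per-time-step control flow described in the abstract can be outlined roughly as follows. Everything here is an assumption made for illustration: the function names (embed_text, embed_control_image, generate_frame, generate_video), the hash-based stand-in encoders, the 8-dimensional embedding width, and the 2x2 frame size are hypothetical, and the arithmetic placeholder in generate_frame stands in for the video generation neural network, whose architecture the abstract does not describe.

```python
import hashlib
from typing import List, Sequence

EMBED_DIM = 8  # hypothetical embedding width, chosen only for the sketch

Frame = List[List[float]]


def _hash_embed(data: bytes, dim: int = EMBED_DIM) -> List[float]:
    """Stand-in encoder: maps bytes to a deterministic vector in [0, 1)."""
    digest = hashlib.sha256(data).digest()
    return [digest[i] / 256.0 for i in range(dim)]


def embed_text(prompt: str) -> List[float]:
    """Hypothetical text encoder for the prompt that specifies the object."""
    return _hash_embed(prompt.encode("utf-8"))


def embed_control_image(image: Sequence[Sequence[int]]) -> List[float]:
    """Hypothetical image encoder for the control input (one object instance)."""
    flat = bytes(pixel % 256 for row in image for pixel in row)
    return _hash_embed(flat)


def generate_frame(t: int, text_emb: List[float], control_emb: List[float],
                   height: int = 2, width: int = 2) -> Frame:
    """Placeholder for the video generation neural network.

    A real model would be a learned network; this toy version only shows
    that the frame depends on the time step and on both embeddings.
    """
    cond = [a + b for a, b in zip(text_emb, control_emb)]
    return [
        [(cond[(r * width + c) % len(cond)] + 0.1 * t) % 1.0 for c in range(width)]
        for r in range(height)
    ]


def generate_video(prompt: str, control_image: Sequence[Sequence[int]],
                   num_time_steps: int) -> List[Frame]:
    """Generates one frame per time step, conditioned on both embeddings."""
    frames = []
    for t in range(num_time_steps):
        # Per the abstract, both embeddings are obtained at each time step.
        text_emb = embed_text(prompt)
        control_emb = embed_control_image(control_image)
        frames.append(generate_frame(t, text_emb, control_emb))
    return frames


video = generate_video("a dog running on a beach", [[0, 255], [128, 64]], 4)
print(len(video), "frames of size", len(video[0]), "x", len(video[0][0]))
```

In this sketch the embeddings are recomputed inside the loop to mirror the wording of the abstract; an implementation could just as plausibly compute them once and reuse them, since the abstract only requires that they be obtained at each time step.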