Publication No.: US20250104290A1
Publication Date: 2025-03-27
Application No.: US18429251
Filing Date: 2024-01-31
Applicant: Erli Ding, Colin Eles, Amir Fruchtman, Riza Alp Guler, Yanyu Li, Xian Liu, Ergeta Muca, Mohammad Rami Koujan, Jian Ren, Dhritiman Sagar, Aliaksandr Siarohin, Ivan Skorokhodov, Sergey Tulyakov
Inventor: Erli Ding, Colin Eles, Amir Fruchtman, Riza Alp Guler, Yanyu Li, Xian Liu, Ergeta Muca, Mohammad Rami Koujan, Jian Ren, Dhritiman Sagar, Aliaksandr Siarohin, Ivan Skorokhodov, Sergey Tulyakov
Abstract: Examples described herein relate to automatic image generation. A plurality of inputs is accessed. The inputs include first input data and second input data. The first input data includes a text prompt describing a desired image and the second input data is indicative of one or more structural features of the desired image. One or more intermediate outputs are generated via a first generative machine learning model that uses the plurality of inputs as first control signals. An output image is generated via a second generative machine learning model that uses at least a subset of the plurality of inputs and at least a subset of the one or more intermediate outputs as second control signals. The output image is presented at a user device of a user.
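Illustrative sketch: the two-stage data flow described in the abstract can be outlined roughly as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The names `Inputs`, `generate_image`, `toy_first_model`, and `toy_second_model` are hypothetical, the models are placeholder callables rather than real generative networks, and the final step of presenting the image at a user device is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Inputs:
    text_prompt: str   # first input data: text describing the desired image
    structure: object  # second input data: structural features (form assumed)


def generate_image(
    inputs: Inputs,
    first_model: Callable[[Dict[str, object]], List[object]],
    second_model: Callable[[Dict[str, object]], object],
    input_keys: Sequence[str] = ("text_prompt",),
    intermediate_indices: Optional[Sequence[int]] = None,
) -> object:
    # Stage 1: the first model uses the full plurality of inputs as control signals.
    first_controls = {
        "text_prompt": inputs.text_prompt,
        "structure": inputs.structure,
    }
    intermediates = first_model(first_controls)

    # Stage 2: the second model uses a subset of the inputs and a subset of the
    # intermediate outputs. Which subsets are chosen is an assumption here.
    if intermediate_indices is None:
        intermediate_indices = range(len(intermediates))
    second_controls = {
        "inputs": {key: first_controls[key] for key in input_keys},
        "intermediates": [intermediates[i] for i in intermediate_indices],
    }
    return second_model(second_controls)


# Placeholder models (hypothetical) that only record what they received.
def toy_first_model(controls: Dict[str, object]) -> List[object]:
    return [f"coarse[{controls['text_prompt']}|{controls['structure']}]"]


def toy_second_model(controls: Dict[str, object]) -> Dict[str, object]:
    return {
        "image": "refined",
        "used_inputs": sorted(controls["inputs"]),
        "n_intermediates": len(controls["intermediates"]),
    }


result = generate_image(
    Inputs(text_prompt="a red fox", structure="pose-map"),
    toy_first_model,
    toy_second_model,
)
```

In this sketch the second stage receives only the text prompt plus every intermediate output. Passing `input_keys=("text_prompt", "structure")` would forward both inputs instead.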