- Patent title: MULTILINGUAL TEXT-TO-IMAGE GENERATION
- Application number: US18296002
- Filing date: 2023-04-05
- Publication number: US20240338859A1
- Publication date: 2024-10-10
- Inventors: Venkata Naveen Kumar Yadav Marri, Ajinkya Gorakhnath Kale
- Applicant: ADOBE INC.
- Applicant address: US CA SAN JOSE
- Patentee: ADOBE INC.
- Current patentee: ADOBE INC.
- Current patentee address: US CA SAN JOSE
- Main classification: G06T11/00
- IPC classifications: G06T11/00; G06F40/58; G06V10/74; G06V10/774; G06V10/82
Abstract:
Systems and methods for image processing are provided. One aspect of the systems and methods includes obtaining a text prompt in a first language. Another aspect of the systems and methods includes encoding the text prompt using a multilingual encoder to obtain a multilingual text embedding. Yet another aspect of the systems and methods includes processing the multilingual text embedding using a diffusion prior model to obtain an image embedding, wherein the diffusion prior model is trained to process multilingual text embeddings from the first language and a second language based on training data from the first language and the second language. Yet another aspect of the systems and methods includes generating an image using a diffusion model based on the image embedding, wherein the image includes an element corresponding to the text prompt.
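The data flow described in the abstract (text prompt, multilingual text embedding, image embedding, image) can be sketched as a three-stage pipeline. The following is a minimal illustration only, not the patented implementation: every function name, the embedding size, and the arithmetic inside each stage are hypothetical stand-ins chosen so the example runs with the Python standard library. In particular, the hash-based encoder does not place translations of the same prompt near each other, which a real multilingual encoder would be expected to do.

```python
import hashlib
import math

EMBED_DIM = 8  # toy embedding size, an assumption for illustration


def encode_text(prompt):
    """Stand-in for the multilingual encoder.

    Hashes the UTF-8 prompt into a unit-length vector. A real encoder
    would map prompts with the same meaning in different languages to
    nearby vectors; this stub does not.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    raw = [b / 255.0 - 0.5 for b in digest[:EMBED_DIM]]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]


def diffusion_prior(text_embedding, steps=4):
    """Stand-in for the diffusion prior.

    Starts from a zero vector and moves it halfway toward the text
    embedding at each step, mimicking iterative refinement of an
    image embedding conditioned on the text embedding.
    """
    image_embedding = [0.0] * len(text_embedding)
    for _ in range(steps):
        image_embedding = [
            0.5 * (i + t) for i, t in zip(image_embedding, text_embedding)
        ]
    return image_embedding


def generate_image(image_embedding, size=4):
    """Stand-in for the diffusion model (decoder).

    Expands the image embedding into a size x size grid of pixel
    intensities clamped to the range [0, 1].
    """
    n = len(image_embedding)
    return [
        [
            min(1.0, max(0.0, 0.5 + image_embedding[(r * size + c) % n]))
            for c in range(size)
        ]
        for r in range(size)
    ]


def text_to_image(prompt):
    """Chain the three stages in the order given in the abstract."""
    text_embedding = encode_text(prompt)
    image_embedding = diffusion_prior(text_embedding)
    return generate_image(image_embedding)


if __name__ == "__main__":
    for row in text_to_image("a red bicycle"):
        print(" ".join(f"{pixel:.2f}" for pixel in row))
```

The point of the sketch is the separation of stages: the prior operates only on embeddings, so under this reading a prior trained on text embeddings from more than one language could be paired with an unchanged image decoder.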
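IPC classification:
G | Physics |
G06 | Computing; calculating or counting |
G06T | Image data processing or generation, in general |
G06T11/00 | 2D [two-dimensional] image generation |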