Invention Grant
- Patent Title: Cross-modal contrastive learning for text-to-image generation based on machine learning models
- Application No.: US17467628
- Application Date: 2021-09-07
- Publication No.: US12067646B2
- Publication Date: 2024-08-20
- Inventor: Han Zhang, Jing Yu Koh, Jason Michael Baldridge, Yinfei Yang, Honglak Lee
- Applicant: Google LLC
- Applicant Address: US CA Mountain View
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: US CA Mountain View
- Agency: McDonnell Boehnen Hulbert & Berghoff LLP
- Main IPC: G06T11/00
- IPC: G06T11/00 ; G06F18/214 ; G06F18/22 ; G06N3/08 ; G10L15/26

Abstract:
A computer-implemented method includes receiving, by a computing device, a particular textual description of a scene. The method also includes applying a neural network for text-to-image generation to generate an output image rendition of the scene, the neural network having been trained to cause two image renditions associated with a same textual description to attract each other and two image renditions associated with different textual descriptions to repel each other based on mutual information between a plurality of corresponding pairs, wherein the plurality of corresponding pairs comprise an image-to-image pair and a text-to-image pair. The method further includes predicting the output image rendition of the scene.
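The attract/repel training objective described in the abstract can be illustrated with an InfoNCE-style contrastive loss, a commonly used lower bound on mutual information. The abstract does not give the exact loss formulation, so the following is a minimal sketch under stated assumptions: the function names (`info_nce_loss`, `cross_modal_loss`), the cosine-similarity scoring, the temperature value, and the unweighted sum of the two terms are all hypothetical choices made for illustration, not the claimed method.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch of paired embeddings.

    Row i of `anchors` and row i of `positives` form a matching pair
    (they should attract); every other row in the batch acts as a
    negative (they should repel). All names here are illustrative.
    """
    # L2-normalize so the dot product equals cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                        # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the diagonal (matching pairs) as the targets.
    return -np.mean(np.diag(log_probs))

def cross_modal_loss(text_emb, real_img_emb, gen_img_emb, temperature=0.1):
    """Assumed combination of a text-to-image and an image-to-image term."""
    text_to_image = info_nce_loss(text_emb, gen_img_emb, temperature)
    image_to_image = info_nce_loss(real_img_emb, gen_img_emb, temperature)
    return text_to_image + image_to_image

# Toy embeddings: 4 captions, each with a real and a generated image.
text = np.eye(4)
real = np.eye(4)
aligned = np.eye(4)                      # generated image i matches caption i
shuffled = np.roll(np.eye(4), 1, axis=0)  # generated images are mismatched

loss_aligned = cross_modal_loss(text, real, aligned)
loss_shuffled = cross_modal_loss(text, real, shuffled)
```

In this toy setup the loss is low when each generated image embedding lines up with its own caption and real image, and high when the pairing is shuffled, which is the attract/repel behavior the abstract describes.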
Published/Granted Literature:
- US20230081171A1 Cross-Modal Contrastive Learning for Text-to-Image Generation based on Machine Learning Models (Publication Date: 2023-03-16)
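IPC Classification:
G | Physics |
G06 | Computing; calculating or counting |
G06T | Image data processing or generation, in general |
G06T11/00 | 2D [two-dimensional] image generation |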