Cross-modal contrastive learning for text-to-image generation based on machine learning models

Invention Grant

US12067646B2 Cross-modal contrastive learning for text-to-image generation based on machine learning models (in force)

Patent Title: Cross-modal contrastive learning for text-to-image generation based on machine learning models
Application No.: US17467628

Application Date: 2021-09-07
Publication No.: US12067646B2

Publication Date: 2024-08-20
Inventor: Han Zhang, Jing Yu Koh, Jason Michael Baldridge, Yinfei Yang, Honglak Lee
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: McDonnell Boehnen Hulbert & Berghoff LLP
Main IPC: G06T11/00
IPC: G06T11/00; G06F18/214; G06F18/22; G06N3/08; G10L15/26

Abstract:

A computer-implemented method includes receiving, by a computing device, a particular textual description of a scene. The method also includes applying a neural network for text-to-image generation to generate an output image rendition of the scene, the neural network having been trained to cause two image renditions associated with a same textual description to attract each other and two image renditions associated with different textual descriptions to repel each other based on mutual information between a plurality of corresponding pairs, wherein the plurality of corresponding pairs comprise an image-to-image pair and a text-to-image pair. The method further includes predicting the output image rendition of the scene.
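The attract/repel behavior described in the abstract can be illustrated with a contrastive objective. The abstract does not give the loss formula, so the following is only a minimal sketch under stated assumptions: it uses an InfoNCE-style loss (a standard lower bound on mutual information) with cosine similarity and a temperature, applied once to a text-to-image pair and once to an image-to-image pair. The function names, the temperature value, and the toy two-dimensional embeddings are all hypothetical and are not taken from the patent.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, candidates, temperature=0.1):
    """Mean InfoNCE loss, assuming anchors[i] is paired with candidates[i].

    The matching candidate is pulled toward its anchor (attract) while
    every other candidate in the batch is pushed away (repel).
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine_similarity(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the matching candidate.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: two captions, two real images, two generated images.
text_emb = [[1.0, 0.0], [0.0, 1.0]]
real_img_emb = [[1.0, 0.1], [0.1, 1.0]]
generated_img_emb = [[0.9, 0.1], [0.2, 0.8]]

# Text-to-image pair: each generated image should match its own caption.
text_to_image_loss = info_nce_loss(text_emb, generated_img_emb)
# Image-to-image pair: each generated image should match the real image
# that shares its caption.
image_to_image_loss = info_nce_loss(real_img_emb, generated_img_emb)
total_loss = text_to_image_loss + image_to_image_loss

# Swapping the generated images so they no longer match their captions
# should raise the loss.
mismatched_loss = info_nce_loss(text_emb, generated_img_emb[::-1])
```

In this sketch the total is a plain unweighted sum of the two pair losses; how the terms are actually combined and weighted is not stated in the abstract.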

Public/Granted literature

US20230081171A1 Cross-Modal Contrastive Learning for Text-to-Image Generation based on Machine Learning Models Public/Granted day: 2023-03-16

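IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06T	Image data processing or generation, in general
G06T11/00	2D [two-dimensional] image generation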