GENERATIVE MODEL FOR MULTI-MODALITY OUTPUTS FROM A SINGLE INPUT

Invention Publication

US20240135672A1 GENERATIVE MODEL FOR MULTI-MODALITY OUTPUTS FROM A SINGLE INPUT (Pending, Published)

Patent Title: GENERATIVE MODEL FOR MULTI-MODALITY OUTPUTS FROM A SINGLE INPUT
Application No.: US17971169

Application Date: 2022-10-20
Publication No.: US20240135672A1

Publication Date: 2024-04-25
Inventors: Yijun Li, Zhixin Shu, Zhen Zhu, Krishna Kumar Singh
Applicant: Adobe Inc.
Applicant Address: US CA San Jose
Assignee: Adobe Inc.
Current Assignee: Adobe Inc.
Current Assignee Address: US CA San Jose
Main IPC: G06V10/70
IPC: G06V10/70; G06N3/04; G06T11/00; G06T15/08

Abstract:

An image generation system implements a multi-branch GAN to generate images that each express visually similar content in a different modality. A generator portion of the multi-branch GAN includes multiple branches that are each tasked with generating one of the different modalities. A discriminator portion of the multi-branch GAN includes multiple fidelity discriminators, one for each of the generator branches, and a consistency discriminator, which constrains the outputs generated by the different generator branches to appear visually similar to one another. During training, outputs from each of the fidelity discriminators and the consistency discriminator are used to compute a non-saturating GAN loss. The non-saturating GAN loss is used to refine parameters of the multi-branch GAN during training until model convergence. The trained multi-branch GAN generates multiple images from a single input, where each of the multiple images depicts visually similar content expressed in a different modality.
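The loss computation described in the abstract can be illustrated with a minimal sketch. This is not the patent's implementation: the abstract does not say how the per-discriminator terms are combined, so the equal-weight sum below is an assumption, and the function names (`softplus`, `generator_loss`, `discriminator_loss`) are hypothetical. The sketch uses the standard non-saturating GAN formulation, in which the generator minimizes -log(sigmoid(D(G(z)))), written here as softplus(-logit).

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def generator_loss(fidelity_logits, consistency_logit):
    """Non-saturating generator loss over all discriminators.

    fidelity_logits: one raw score per generator branch (one per modality),
        each produced by that branch's fidelity discriminator.
    consistency_logit: raw score from the consistency discriminator, which
        judges the set of branch outputs together.
    ASSUMPTION: terms are summed with equal weight; the source does not
    specify any weighting.
    """
    logits = list(fidelity_logits) + [consistency_logit]
    # -log(sigmoid(l)) == softplus(-l)
    return sum(softplus(-l) for l in logits)


def discriminator_loss(real_logit: float, fake_logit: float) -> float:
    """Standard GAN loss for a single discriminator on one real/fake pair."""
    return softplus(-real_logit) + softplus(fake_logit)


# Example: two modality branches (hypothetical scores) plus the consistency score.
loss = generator_loss(fidelity_logits=[0.3, -1.2], consistency_logit=0.5)
```

In a real training loop the logits would come from neural network discriminators and the loss would be backpropagated through the generator branches; the plain floats here only show how the terms fit together.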

Published/Granted Documents

US20240233318A9 GENERATIVE MODEL FOR MULTI-MODALITY OUTPUTS FROM A SINGLE INPUT Publication/Grant Date: 2024-07-11

IPC Classification:

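G	Physics
G06	Computing; calculating or counting
G06V	Image or video recognition or understanding
G06V10/00	Arrangements for image or video recognition or understanding (character recognition in images or video, G06V30/10)
G06V10/70	.using pattern recognition or machine learning (optical pattern recognition or electronic computation, G06V10/88)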