DEBIASING IMAGE TO IMAGE TRANSLATION MODELS
    31.
    Invention publication

    Publication (announcement) No.: US20240046412A1

    Publication (announcement) date: 2024-02-08

    Application No.: US17880120

    Application date: 2022-08-03

    Applicant: ADOBE INC.

    CPC classification number: G06T3/4046 G06T3/4053

    Abstract: A system debiases image translation models to produce generated images that contain minority attributes. A balanced batch for a minority attribute is created by over-sampling images having the minority attribute from an image dataset. An image translation model is trained using images from the balanced batch by applying a supervised contrastive loss to the output of an encoder of the image translation model and an auxiliary classifier loss based on predicted attributes in images generated by a decoder of the image translation model. Once trained, the image translation model is used to generate images with the minority attribute when given an input image having the minority attribute.
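
The over-sampling step in this abstract can be illustrated with a minimal sketch. The abstract does not give a target ratio or a data layout, so the half-and-half split, the `(image_id, attributes)` pairs, the attribute name "eyeglasses", and the function name `make_balanced_batch` are all assumptions made for illustration, not the patented method.

```python
import random

def make_balanced_batch(dataset, attribute, batch_size, seed=0):
    """Draw a batch in which half the images carry `attribute`.

    `dataset` is a list of (image_id, set_of_attributes) pairs (assumed layout).
    The 50/50 split is an assumption; the abstract only says "balanced".
    """
    rng = random.Random(seed)
    minority = [item for item in dataset if attribute in item[1]]
    majority = [item for item in dataset if attribute not in item[1]]
    if not minority or not majority:
        raise ValueError("need images both with and without the attribute")
    n_minority = batch_size // 2
    # Sampling with replacement repeats minority images as often as needed,
    # which is what over-sampling means here.
    batch = rng.choices(minority, k=n_minority)
    batch += rng.choices(majority, k=batch_size - n_minority)
    rng.shuffle(batch)
    return batch

# Toy dataset: only 2 of 100 images have the hypothetical attribute.
dataset = [(i, {"eyeglasses"} if i < 2 else set()) for i in range(100)]
batch = make_balanced_batch(dataset, "eyeglasses", batch_size=8)
minority_count = sum("eyeglasses" in attrs for _, attrs in batch)
print(minority_count, "of", len(batch), "images have the minority attribute")
```

Although the attribute appears in only 2% of the toy dataset, it appears in half of every batch, so a model trained on such batches sees the minority attribute far more often than uniform sampling would allow.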

    Generating synthesized digital images utilizing a multi-resolution generator neural network

    Publication (announcement) No.: US11769227B2

    Publication (announcement) date: 2023-09-26

    Application No.: US17400426

    Application date: 2021-08-12

    Applicant: Adobe Inc.

    CPC classification number: G06T3/4046 G06F18/253 G06N3/04 G06V10/40 G06V30/274

    Abstract: This disclosure describes methods, non-transitory computer readable storage media, and systems that generate synthesized digital images via multi-resolution generator neural networks. The disclosed system extracts multi-resolution features from a scene representation to condition a spatial feature tensor and a latent code to modulate an output of a generator neural network. For example, the disclosed system utilizes a base encoder of the generator neural network to generate a feature set from a semantic label map of a scene. The disclosed system then utilizes a bottom-up encoder to extract multi-resolution features and generate a latent code from the feature set. Furthermore, the disclosed system determines a spatial feature tensor by utilizing a top-down encoder to up-sample and aggregate the multi-resolution features. The disclosed system then utilizes a decoder to generate a synthesized digital image based on the spatial feature tensor and the latent code.

    MULTIMODAL DIFFUSION MODELS
    34.
    Invention publication

    Publication (announcement) No.: US20240265505A1

    Publication (announcement) date: 2024-08-08

    Application No.: US18165141

    Application date: 2023-02-06

    Applicant: ADOBE INC.

    CPC classification number: G06T5/70 G06T2207/20081 G06T2207/20084

    Abstract: Systems and methods for image processing are described. Embodiments of the present disclosure obtain a noise image and guidance information for generating an image. A diffusion model generates an intermediate noise prediction for the image based on the noise image. A conditioning network generates noise modulation parameters. The intermediate noise prediction and the noise modulation parameters are combined to obtain a modified intermediate noise prediction. The diffusion model generates the image based on the modified intermediate noise prediction, wherein the image depicts a scene based on the guidance information.
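
The abstract says the intermediate noise prediction and the noise modulation parameters are "combined" but does not say how. The sketch below assumes one common choice, a per-channel scale and shift, purely for illustration; the function name `modulate_noise_prediction` and the `(1 + scale) * value + shift` form are assumptions, not details taken from the patent.

```python
def modulate_noise_prediction(noise_pred, scale, shift):
    """Apply an assumed per-channel scale-and-shift to a noise prediction.

    noise_pred: list of channels, each a flat list of floats.
    scale, shift: one float per channel, standing in for the output of a
    conditioning network (hypothetical parameterization).
    """
    if not (len(noise_pred) == len(scale) == len(shift)):
        raise ValueError("one scale and one shift are needed per channel")
    return [
        [(1.0 + s) * value + b for value in channel]
        for channel, s, b in zip(noise_pred, scale, shift)
    ]

noise_pred = [[0.5, -1.0], [2.0, 0.0]]
modified = modulate_noise_prediction(noise_pred, scale=[0.0, 0.5], shift=[0.0, -1.0])
print(modified)  # [[0.5, -1.0], [2.0, -1.0]]
```

With zero scale and zero shift the prediction passes through unchanged, so under this assumed form the guidance only alters the channels the conditioning network chooses to modulate.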

    GENERATIVE MODEL FOR MULTI-MODALITY OUTPUTS FROM A SINGLE INPUT

    Publication (announcement) No.: US20240233318A9

    Publication (announcement) date: 2024-07-11

    Application No.: US17971169

    Application date: 2022-10-21

    Applicant: Adobe Inc.

    CPC classification number: G06V10/70 G06N3/0454 G06T11/001 G06T15/08

    Abstract: An image generation system implements a multi-branch GAN to generate images that each express visually similar content in a different modality. A generator portion of the multi-branch GAN includes multiple branches that are each tasked with generating one of the different modalities. A discriminator portion of the multi-branch GAN includes multiple fidelity discriminators, one for each of the generator branches, and a consistency discriminator, which constrains the outputs generated by the different generator branches to appear visually similar to one another. During training, outputs from each of the fidelity discriminators and the consistency discriminator are used to compute a non-saturating GAN loss. The non-saturating GAN loss is used to refine parameters of the multi-branch GAN during training until model convergence. The trained multi-branch GAN generates multiple images from a single input, where each of the multiple images depicts visually similar content expressed in a different modality.
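
The non-saturating GAN loss named in this abstract has a standard form: the generator minimizes -log(sigmoid(D(G(z)))) rather than log(1 - sigmoid(D(G(z)))). The sketch below computes it from raw discriminator logits. How the patent weights the fidelity and consistency terms is not stated, so the unweighted sum in `multi_branch_generator_loss` and all function names are assumptions.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def generator_loss(fake_logits):
    """Non-saturating generator loss: mean of -log(sigmoid(logit))."""
    return sum(softplus(-logit) for logit in fake_logits) / len(fake_logits)

def discriminator_loss(real_logits, fake_logits):
    """Standard discriminator loss on real and generated samples."""
    real = sum(softplus(-logit) for logit in real_logits) / len(real_logits)
    fake = sum(softplus(logit) for logit in fake_logits) / len(fake_logits)
    return real + fake

def multi_branch_generator_loss(fidelity_fake_logits, consistency_fake_logits):
    """Sum the generator loss over every fidelity discriminator plus the
    consistency discriminator (equal weighting is an assumption)."""
    total = sum(generator_loss(logits) for logits in fidelity_fake_logits)
    return total + generator_loss(consistency_fake_logits)

# Two generator branches, each scored by its own fidelity discriminator,
# plus one consistency discriminator scoring the outputs jointly.
loss = multi_branch_generator_loss([[0.0, 1.0], [-1.0, 0.5]], [0.2, -0.3])
print(round(loss, 4))
```

The non-saturating form keeps the generator's gradient large when the discriminator confidently rejects its samples, which is why it is usually preferred early in training.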
