Diverse Image Inpainting Using Contrastive Learning

    Publication No.: US20230342884A1

    Publication Date: 2023-10-26

    Application No.: US17725818

    Filing Date: 2022-04-21

    Applicant: Adobe Inc.

    Abstract: An image inpainting system is described that receives an input image that includes a masked region. From the input image, the image inpainting system generates a synthesized image that depicts an object in the masked region by selecting a first code that represents a known factor characterizing a visual appearance of the object and a second code that represents an unknown factor characterizing the visual appearance of the object apart from the known factor in latent space. The input image, the first code, and the second code are provided as input to a generative adversarial network that is trained to generate the synthesized image using contrastive losses. Different synthesized images are generated from the same input image using different combinations of first and second codes, and the synthesized images are output for display.
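As a rough illustration of the abstract above, the two latent codes (known factor and unknown factor) can be combined with the masked image to form the generator's input. The following is a minimal sketch, not the patented implementation; the function name, code dimensions, and channel layout are all assumptions for illustration.

```python
import numpy as np

def build_generator_input(image, mask, known_code, unknown_code):
    """Concatenate a masked image with spatially broadcast latent codes.

    image: (H, W, C) float array; mask: (H, W) binary array where 1 marks
    the region to inpaint. known_code / unknown_code: 1-D latent vectors
    (hypothetical shapes -- the patent does not fix the encoding).
    """
    h, w, _ = image.shape
    masked = image * (1.0 - mask)[..., None]           # zero out the hole
    code = np.concatenate([known_code, unknown_code])  # joint latent code
    code_map = np.broadcast_to(code, (h, w, code.size))
    return np.concatenate([masked, mask[..., None], code_map], axis=-1)

# Different (known, unknown) code pairs yield different generator inputs,
# so the same masked image can produce diverse synthesized results.
img = np.random.rand(8, 8, 3)
msk = np.zeros((8, 8)); msk[2:6, 2:6] = 1.0
x = build_generator_input(img, msk, np.ones(4), np.zeros(4))
print(x.shape)  # (8, 8, 12): 3 image + 1 mask + 8 code channels
```

Swapping in a different `known_code` (e.g. a different object identity) while varying `unknown_code` is what yields the "different combinations of first and second codes" the abstract describes.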

    GENERATING SYNTHESIZED DIGITAL IMAGES UTILIZING A MULTI-RESOLUTION GENERATOR NEURAL NETWORK

    Publication No.: US20230053588A1

    Publication Date: 2023-02-23

    Application No.: US17400426

    Filing Date: 2021-08-12

    Applicant: Adobe Inc.

    Abstract: This disclosure describes methods, non-transitory computer readable storage media, and systems that generate synthesized digital images via multi-resolution generator neural networks. The disclosed system extracts multi-resolution features from a scene representation to condition a spatial feature tensor and a latent code to modulate an output of a generator neural network. For example, the disclosed system utilizes a base encoder of the generator neural network to generate a feature set from a semantic label map of a scene. The disclosed system then utilizes a bottom-up encoder to extract multi-resolution features and generate a latent code from the feature set. Furthermore, the disclosed system determines a spatial feature tensor by utilizing a top-down encoder to up-sample and aggregate the multi-resolution features. The disclosed system then utilizes a decoder to generate a synthesized digital image based on the spatial feature tensor and the latent code.
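The bottom-up / top-down flow in this abstract can be sketched with plain array operations. This is a toy stand-in, assuming average pooling for the bottom-up encoder and nearest-neighbor upsampling for the top-down path; the real networks are learned.

```python
import numpy as np

def bottom_up(features):
    """Downsample a base feature map into a multi-resolution pyramid
    (2x2 average pooling stands in for a strided conv encoder)."""
    pyramid = [features]
    while pyramid[-1].shape[0] > 2:
        f = pyramid[-1]
        pooled = f.reshape(f.shape[0] // 2, 2, f.shape[1] // 2, 2, -1)
        pyramid.append(pooled.mean(axis=(1, 3)))
    return pyramid

def top_down(pyramid):
    """Upsample every level to full resolution and aggregate them
    into a single spatial feature tensor."""
    h, w, _ = pyramid[0].shape
    agg = np.zeros_like(pyramid[0])
    for level in pyramid:
        up = level.repeat(h // level.shape[0], axis=0)
        up = up.repeat(w // level.shape[1], axis=1)
        agg += up
    return agg / len(pyramid)

base = np.random.rand(16, 16, 8)    # feature set from the base encoder
pyr = bottom_up(base)               # multi-resolution features
latent = pyr[-1].mean(axis=(0, 1))  # global latent code from coarsest level
spatial = top_down(pyr)             # spatial feature tensor for the decoder
print(spatial.shape, latent.shape)
```

The decoder described in the abstract would then consume `spatial` (per-pixel conditioning) and `latent` (global modulation) together.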

    SYNTHESIZING DIGITAL IMAGES UTILIZING IMAGE-GUIDED MODEL INVERSION OF AN IMAGE CLASSIFIER

    Publication No.: US20220261972A1

    Publication Date: 2022-08-18

    Application No.: US17178681

    Filing Date: 2021-02-18

    Applicant: Adobe Inc.

    Abstract: This disclosure describes methods, non-transitory computer readable storage media, and systems that utilize image-guided model inversion of an image classifier with a discriminator. The disclosed systems utilize a neural network image classifier to encode features of an initial image and a target image. The disclosed system also reduces a feature distance between the features of the initial image and the features of the target image at a plurality of layers of the neural network image classifier by utilizing a feature distance regularizer. Additionally, the disclosed system reduces a patch difference between image patches of the initial image and image patches of the target image by utilizing a patch-based discriminator with a patch consistency regularizer. The disclosed system then generates a synthesized digital image based on the constrained feature set and constrained image patches of the initial image.
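The feature distance regularizer in this abstract amounts to penalizing the gap between classifier activations of the initial and target images at several layers. A minimal sketch, assuming a simple per-layer mean-squared distance (the patent does not specify the exact norm):

```python
import numpy as np

def feature_distance_loss(init_feats, target_feats):
    """Sum of per-layer mean-squared feature distances, as a sketch of
    the feature distance regularizer applied across a plurality of
    classifier layers."""
    return sum(np.mean((a - b) ** 2) for a, b in zip(init_feats, target_feats))

rng = np.random.default_rng(0)
target = [rng.random((4, 4)) for _ in range(3)]  # 3 hypothetical layers
init_far = [t + 1.0 for t in target]             # initial image far from target
init_near = [t + 0.1 for t in target]            # initial image close to target
print(feature_distance_loss(init_far, target))   # ~3.0 (1.0 per layer)
print(feature_distance_loss(init_near, target))  # much smaller
```

Minimizing this quantity with respect to the initial image, alongside the patch-based discriminator term, is what drives the image-guided inversion toward the target.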

    Generating a modified digital image utilizing a human inpainting model

    Publication No.: US12260530B2

    Publication Date: 2025-03-25

    Application No.: US18190544

    Filing Date: 2023-03-27

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify digital images via scene-based editing using image understanding facilitated by artificial intelligence. For example, in one or more embodiments the disclosed systems utilize generative machine learning models to create modified digital images portraying human subjects. In particular, the disclosed systems generate modified digital images by performing infill modifications to complete a digital image or human inpainting for portions of a digital image that portrays a human. Moreover, in some embodiments, the disclosed systems perform reposing of subjects portrayed within a digital image to generate modified digital images. In addition, the disclosed systems in some embodiments perform facial expression transfer and facial expression animations to generate modified digital images or animations.

    UTILIZING INDIVIDUAL-CONCEPT TEXT-IMAGE ALIGNMENT TO ENHANCE COMPOSITIONAL CAPACITY OF TEXT-TO-IMAGE MODELS

    Publication No.: US20250078327A1

    Publication Date: 2025-03-06

    Application No.: US18457895

    Filing Date: 2023-08-29

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize a text-image alignment loss to train a diffusion model to generate digital images from input text. In particular, in some embodiments, the disclosed systems generate a prompt noise representation from a text prompt with a first text concept and a second text concept using a denoising step of a diffusion neural network. Further, in some embodiments, the disclosed systems generate a first concept noise representation from the first text concept and a second concept noise representation from the second text concept. Moreover, the disclosed systems combine the first and second concept noise representations to generate a combined concept noise representation. Accordingly, in some embodiments, by comparing the combined concept noise representation and the prompt noise representation, the disclosed systems modify parameters of the diffusion neural network.
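The comparison step in this abstract can be illustrated numerically. This sketch assumes the combination operator is a simple average of the per-concept noise predictions and that the comparison is a mean-squared error; the patent leaves both choices abstract.

```python
import numpy as np

def alignment_loss(prompt_noise, concept_noises):
    """Compare the prompt's predicted noise against the combination of
    per-concept predicted noise (averaged here for illustration)."""
    combined = np.mean(concept_noises, axis=0)
    return float(np.mean((prompt_noise - combined) ** 2))

eps_prompt = np.array([0.2, -0.1, 0.4])  # noise predicted for the full prompt
eps_a = np.array([0.1, 0.0, 0.5])        # noise for the first text concept
eps_b = np.array([0.3, -0.2, 0.3])       # noise for the second text concept
loss = alignment_loss(eps_prompt, [eps_a, eps_b])
print(loss)  # ~0: the averaged concept noises match the prompt noise
```

A nonzero loss indicates the prompt-level denoising drifts from what the individual concepts predict, and the gradient of this loss is what updates the diffusion network's parameters.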

    ONE-CLICK IMAGE EXTENSION WITH QUICK MASK ADJUSTMENT

    Publication No.: US20240331214A1

    Publication Date: 2024-10-03

    Application No.: US18610861

    Filing Date: 2024-03-20

    Applicant: Adobe Inc.

    Abstract: Systems and methods for image processing (e.g., image extension or image uncropping) using neural networks are described. One or more aspects include obtaining an image (e.g., a source image, a user provided image, etc.) having an initial aspect ratio, and identifying a target aspect ratio (e.g., via user input) that is different from the initial aspect ratio. The image may be positioned in an image frame having the target aspect ratio, where the image frame includes an image region containing the image and one or more extended regions outside the boundaries of the image. An extended image may be generated (e.g., using a generative neural network), where the extended image includes the image in the image region as well as generated image portions in the extended regions and the one or more generated image portions comprise an extension of a scene element depicted in the image.
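The framing step in this abstract (placing the source image in a frame with the target aspect ratio and marking the extended regions) is concrete enough to sketch. Function name and centering behavior are assumptions; the mask of `True` pixels is what a generative network would fill.

```python
import numpy as np

def make_extension_frame(image, target_ratio):
    """Place an image in a frame with the target aspect ratio (w/h) and
    return the frame plus a mask of the regions to be generated."""
    h, w, c = image.shape
    if target_ratio >= w / h:
        new_h, new_w = h, int(round(h * target_ratio))   # extend horizontally
    else:
        new_h, new_w = int(round(w / target_ratio)), w   # extend vertically
    frame = np.zeros((new_h, new_w, c), dtype=image.dtype)
    top, left = (new_h - h) // 2, (new_w - w) // 2       # center the image
    frame[top:top + h, left:left + w] = image
    mask = np.ones((new_h, new_w), dtype=bool)
    mask[top:top + h, left:left + w] = False             # True = to generate
    return frame, mask

img = np.random.rand(100, 100, 3)  # square source image
frame, mask = make_extension_frame(img, 16 / 9)
print(frame.shape, mask.sum())     # (100, 178, 3) with 7800 pixels to generate
```

The generative network then synthesizes content only in the masked extended regions, so the original pixels inside the image region are preserved verbatim.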

    GENERATIVE MODEL FOR MULTI-MODALITY OUTPUTS FROM A SINGLE INPUT

    Publication No.: US20240135672A1

    Publication Date: 2024-04-25

    Application No.: US17971169

    Filing Date: 2022-10-20

    Applicant: Adobe Inc.

    CPC classification number: G06V10/70 G06N3/0454 G06T11/001 G06T15/08

    Abstract: An image generation system implements a multi-branch GAN to generate images that each express visually similar content in a different modality. A generator portion of the multi-branch GAN includes multiple branches that are each tasked with generating one of the different modalities. A discriminator portion of the multi-branch GAN includes multiple fidelity discriminators, one for each of the generator branches, and a consistency discriminator, which constrains the outputs generated by the different generator branches to appear visually similar to one another. During training, outputs from each of the fidelity discriminators and the consistency discriminator are used to compute a non-saturating GAN loss. The non-saturating GAN loss is used to refine parameters of the multi-branch GAN during training until model convergence. The trained multi-branch GAN generates multiple images from a single input, where each of the multiple images depicts visually similar content expressed in a different modality.
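The non-saturating GAN loss named in this abstract has a standard form: the generator minimizes `-log(sigmoid(D(G(z))))` for each discriminator's output. A minimal sketch, assuming one scalar logit per fidelity discriminator plus one from the consistency discriminator (the grouping is illustrative):

```python
import numpy as np

def non_saturating_g_loss(discriminator_logits):
    """Non-saturating generator loss: -log(sigmoid(logit)) summed over
    all discriminator outputs, computed as softplus(-logit) via
    logaddexp for numerical stability."""
    logits = np.asarray(discriminator_logits, dtype=float)
    return float(np.sum(np.logaddexp(0.0, -logits)))

# One logit per fidelity discriminator (one per modality branch) plus
# one from the consistency discriminator; values are illustrative.
fidelity_logits = [2.0, 1.5, 0.5]
consistency_logit = 1.0
loss = non_saturating_g_loss(fidelity_logits + [consistency_logit])
print(round(loss, 4))
```

Because the consistency discriminator's logit enters the same sum, the generator branches are pushed both toward per-modality realism and toward mutual visual consistency, matching the training described in the abstract.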
