Modifying digital images utilizing a language guided image editing model

    Publication No.: US12248796B2

    Publication Date: 2025-03-11

    Application No.: US17384109

    Filing Date: 2021-07-23

    Applicant: Adobe Inc.

    Inventors: Ning Xu; Zhe Lin

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that perform language guided digital image editing utilizing a cycle-augmentation generative-adversarial neural network (CAGAN) that is augmented using a cross-modal cyclic mechanism. For example, the disclosed systems generate an editing description network that generates language embeddings which represent image transformations applied between a digital image and a modified digital image. The disclosed systems can further train a GAN to generate modified images by providing an input image and natural language embeddings generated by the editing description network (representing various modifications to the digital image from a ground truth modified image). In some instances, the disclosed systems also utilize an image request attention approach with the GAN to generate images that include adaptive edits in different spatial locations of the image.
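The abstract's training flow (an editing-description network embeds the transformation between an image and its edited version, and a generator reproduces the edit from that embedding) can be illustrated with a minimal numpy sketch. The random projections below are stand-ins for learned network weights, and all function names are hypothetical, not the patent's own.

```python
import numpy as np

rng = np.random.default_rng(3)

def editing_description(image, edited_image):
    """Toy editing-description network: embed the transformation
    between an image and its edited version as a random projection
    of the pixel difference (stand-in for learned weights)."""
    proj = rng.normal(size=(8, image.size))
    return proj @ (edited_image - image).ravel()

def generator(image, edit_embedding):
    """Toy generator: apply the embedded edit back to the input image."""
    proj = rng.normal(size=(image.size, edit_embedding.size))
    return image + (proj @ edit_embedding).reshape(image.shape) * 0.01

image = rng.normal(size=(4, 4))
edited = image + 0.5                       # ground-truth modified image
emb = editing_description(image, edited)   # language-embedding stand-in
reconstruction = generator(image, emb)     # generator's attempted edit
```

In the actual system the embedding would come from natural language, and a cyclic consistency objective would tie the generator's output back to the description network.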

    Generating embeddings for text and image queries within a common embedding space for visual-text image searches

    Publication No.: US12235891B2

    Publication Date: 2025-02-25

    Application No.: US17809503

    Filing Date: 2022-06-28

    Applicant: Adobe Inc.

    Abstract: Systems, methods, and non-transitory computer-readable media implement related image search and image modification processes using various search engines and a consolidated graphical user interface. For instance, one or more embodiments involve receiving an input digital image and search input and further modifying the input digital image using the image search results retrieved in response to the search input. In some cases, the search input includes a multi-modal search input having multiple queries (e.g., an image query and a text query), and one or more embodiments involve retrieving the image search results utilizing a weighted combination of the queries. Some implementations involve generating an input embedding for the search input (e.g., the multi-modal search input) and retrieving the image search results using the input embedding.
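The "weighted combination of the queries" in a common embedding space can be sketched in a few lines of numpy. This is a minimal illustration under the assumption that the combination is a convex blend of normalized embeddings and retrieval uses cosine similarity; function names and the weight value are hypothetical.

```python
import numpy as np

def combine_queries(image_emb, text_emb, weight):
    """Blend image and text query embeddings with a scalar weight,
    then re-normalize so cosine similarity stays well-scaled."""
    combined = weight * image_emb + (1.0 - weight) * text_emb
    return combined / np.linalg.norm(combined)

def search(query_emb, index_embs, k=2):
    """Return indices of the k most similar index embeddings (cosine)."""
    index_norm = index_embs / np.linalg.norm(index_embs, axis=1, keepdims=True)
    scores = index_norm @ query_emb
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(0)
img_q = rng.normal(size=8); img_q /= np.linalg.norm(img_q)
txt_q = rng.normal(size=8); txt_q /= np.linalg.norm(txt_q)
index = rng.normal(size=(5, 8))            # 5 indexed image embeddings

q = combine_queries(img_q, txt_q, weight=0.6)  # favor the image query
top = search(q, index)                          # top-k result indices
```

Setting `weight` closer to 1 biases retrieval toward the image query, closer to 0 toward the text query.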

    DIGITAL IMAGE INPAINTING UTILIZING GLOBAL AND LOCAL MODULATION LAYERS OF AN INPAINTING NEURAL NETWORK

    Publication No.: US20250054116A1

    Publication Date: 2025-02-13

    Application No.: US18929330

    Filing Date: 2024-10-28

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate inpainted digital images utilizing a cascaded modulation inpainting neural network. For example, the disclosed systems utilize a cascaded modulation inpainting neural network that includes cascaded modulation decoder layers. For example, in one or more decoder layers, the disclosed systems start with global code modulation that captures the global-range image structures, followed by an additional modulation that refines the global predictions. Accordingly, in one or more implementations, the image inpainting system provides a mechanism to correct distorted local details. Furthermore, in one or more implementations, the image inpainting system leverages fast Fourier convolution blocks within different resolution layers of the encoder architecture to expand the receptive field of the encoder and to allow the network encoder to better capture global structure.
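The cascade described above (a global modulation predicted from a global code, followed by a second modulation that refines the global prediction) can be sketched as follows. This is a toy numpy illustration, not the patented architecture: the random weights stand in for learned parameters, and the modulation form is a simple channel-wise scaling chosen for brevity.

```python
import numpy as np

def modulate(features, scale):
    """Channel-wise feature modulation (scale only, for brevity)."""
    return features * scale[:, None, None]

def cascaded_modulation(features, global_code, rng):
    """Toy cascaded modulation: a first modulation predicted from a
    global code captures global structure; a second modulation,
    conditioned on that output, refines the prediction."""
    c = features.shape[0]
    w = rng.normal(size=(c, global_code.size))           # stand-in weights
    global_scale = 1.0 + 0.1 * np.tanh(w @ global_code)  # global modulation
    global_out = modulate(features, global_scale)
    refine_scale = 1.0 + 0.1 * np.tanh(global_out.mean(axis=(1, 2)))
    return modulate(global_out, refine_scale)            # refined output

rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8, 8))   # decoder features (C, H, W)
code = rng.normal(size=32)            # global code from the encoder
out = cascaded_modulation(feats, code, rng)
```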

    Extracting attributes from arbitrary digital images utilizing a multi-attribute contrastive classification neural network

    Publication No.: US12136250B2

    Publication Date: 2024-11-05

    Application No.: US17332734

    Filing Date: 2021-05-27

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that extract multiple attributes from an object portrayed in a digital image utilizing a multi-attribute contrastive classification neural network. For example, the disclosed systems utilize a multi-attribute contrastive classification neural network that includes an embedding neural network, a localizer neural network, a multi-attention neural network, and a classifier neural network. In some cases, the disclosed systems train the multi-attribute contrastive classification neural network utilizing a multi-attribute, supervised-contrastive loss. In some embodiments, the disclosed systems generate negative attribute training labels for labeled digital images utilizing positive attribute labels that correspond to the labeled digital images.
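The idea of deriving negative attribute labels from positive ones can be illustrated concretely. A plausible reading (an assumption for illustration, not the patent's exact procedure) is that attributes of the same type are mutually exclusive, so a positive label for one value implies negatives for the others. The attribute vocabulary below is a hypothetical example.

```python
# attribute vocabulary grouped by type (hypothetical example values)
ATTRIBUTE_TYPES = {
    "color": ["red", "blue", "green"],
    "material": ["wood", "metal", "plastic"],
}

def negatives_from_positives(positive_labels):
    """For each attribute type where the image has a positive label,
    treat the other, mutually exclusive values of that type as
    negative training labels."""
    negatives = []
    for attr_type, values in ATTRIBUTE_TYPES.items():
        positives_here = [p for p in positive_labels if p in values]
        if positives_here:
            negatives.extend(v for v in values if v not in positives_here)
    return negatives

# an image labeled "red" and "wood" yields negatives for the other
# values of those two attribute types
print(negatives_from_positives(["red", "wood"]))
# ['blue', 'green', 'metal', 'plastic']
```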

    Multi-scale distillation for low-resolution detection

    Publication No.: US12136185B2

    Publication Date: 2024-11-05

    Application No.: US17455134

    Filing Date: 2021-11-16

    Applicant: ADOBE INC.

    Abstract: Systems and methods for image processing are described. The systems and methods include receiving a low-resolution image; generating a feature map based on the low-resolution image using an encoder of a student network, wherein the encoder of the student network is trained based on comparing a predicted feature map from the encoder of the student network and a fused feature map from a teacher network, and wherein the fused feature map represents a combination of a first feature map from a high-resolution encoder of the teacher network and a second feature map from a low-resolution encoder of the teacher network; and decoding the feature map to obtain prediction information for the low-resolution image.
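The distillation target described above, a fused feature map combining the teacher's high- and low-resolution branches, can be sketched in numpy. The convex combination and L2 distance below are illustrative assumptions; the patent does not commit to these specific operators here.

```python
import numpy as np

def fuse(high_res_feat, low_res_feat, alpha=0.5):
    """Fuse teacher feature maps from the high- and low-resolution
    encoders (a simple convex combination as a stand-in)."""
    return alpha * high_res_feat + (1.0 - alpha) * low_res_feat

def distillation_loss(student_feat, fused_feat):
    """Mean squared distance between the student's predicted feature
    map and the teacher's fused feature map."""
    return float(np.mean((student_feat - fused_feat) ** 2))

rng = np.random.default_rng(1)
hi = rng.normal(size=(16, 8, 8))        # teacher high-resolution branch
lo = rng.normal(size=(16, 8, 8))        # teacher low-resolution branch
student = rng.normal(size=(16, 8, 8))   # student encoder prediction

fused = fuse(hi, lo)
loss = distillation_loss(student, fused)  # minimized during training
```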

    PERSONALIZED TEXT-TO-IMAGE GENERATION
    Invention Publication

    Publication No.: US20240355022A1

    Publication Date: 2024-10-24

    Application No.: US18476504

    Filing Date: 2023-09-28

    Applicant: ADOBE INC.

    CPC classification number: G06T11/60 G06T7/194 G06T9/00 G06T2207/20081

    Abstract: One or more aspects of a method, apparatus, and non-transitory computer readable medium include obtaining an input description and an input image depicting a subject, encoding the input description using a text encoder of an image generation model to obtain a text embedding, and encoding the input image using a subject encoder of the image generation model to obtain a subject embedding. A guidance embedding is generated by combining the subject embedding and the text embedding, and then an output image is generated based on the guidance embedding using a diffusion model of the image generation model. The output image depicts aspects of the subject and the input description.
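The step of combining the subject embedding and the text embedding into a guidance embedding can be sketched as follows. Concatenation is used here as an illustrative assumption; the patent only states that the two embeddings are combined, and the function names are hypothetical.

```python
import numpy as np

def guidance_embedding(subject_emb, text_emb):
    """Combine subject and text embeddings into one guidance
    embedding (concatenation chosen here for illustration)."""
    return np.concatenate([text_emb, subject_emb])

rng = np.random.default_rng(2)
text_emb = rng.normal(size=16)     # from the text encoder
subject_emb = rng.normal(size=16)  # from the subject encoder

guidance = guidance_embedding(subject_emb, text_emb)
# a diffusion model would then generate the output image
# conditioned on `guidance`
```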
