-
Publication No.: US12248796B2
Publication date: 2025-03-11
Application No.: US17384109
Filing date: 2021-07-23
Applicant: Adobe Inc.
Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that perform language guided digital image editing utilizing a cycle-augmentation generative-adversarial neural network (CAGAN) that is augmented using a cross-modal cyclic mechanism. For example, the disclosed systems generate an editing description network that generates language embeddings which represent image transformations applied between a digital image and a modified digital image. The disclosed systems can further train a GAN to generate modified images by providing an input image and natural language embeddings generated by the editing description network (representing various modifications to the digital image from a ground truth modified image). In some instances, the disclosed systems also utilize an image request attention approach with the GAN to generate images that include adaptive edits in different spatial locations of the image.
-
302.
Publication No.: US12235891B2
Publication date: 2025-02-25
Application No.: US17809503
Filing date: 2022-06-28
Applicant: Adobe Inc.
Inventor: Zhifei Zhang , Zhe Lin
IPC: G06F16/33 , G06F16/3332 , G06F16/532 , G06F16/535
Abstract: Systems, methods, and non-transitory computer-readable media implement related image search and image modification processes using various search engines and a consolidated graphical user interface. For instance, one or more embodiments involve receiving an input digital image and search input and further modifying the input digital image using the image search results retrieved in response to the search input. In some cases, the search input includes a multi-modal search input having multiple queries (e.g., an image query and a text query), and one or more embodiments involve retrieving the image search results utilizing a weighted combination of the queries. Some implementations involve generating an input embedding for the search input (e.g., the multi-modal search input) and retrieving the image search results using the input embedding.
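The weighted combination of an image query and a text query described in this abstract can be illustrated with a minimal sketch. It assumes both queries have already been embedded into a shared vector space and that retrieval ranks by cosine similarity; the function names, the `image_weight` parameter, and the toy index are hypothetical and are not taken from the patent.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def combine_queries(image_emb, text_emb, image_weight=0.5):
    """Blend two query embeddings; image_weight is a hypothetical knob in [0, 1]."""
    a = normalize(image_emb)
    b = normalize(text_emb)
    mixed = [image_weight * x + (1.0 - image_weight) * y for x, y in zip(a, b)]
    return normalize(mixed)


def search(index, query_emb, top_k=3):
    """Rank indexed items by cosine similarity to the combined query embedding."""
    scored = []
    for name, emb in index.items():
        score = sum(q * v for q, v in zip(query_emb, normalize(emb)))
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy index of pre-computed image embeddings (made-up values for illustration).
index = {
    "beach_sunset": [0.9, 0.1, 0.0],
    "city_night": [0.0, 0.2, 0.9],
    "forest_trail": [0.1, 0.9, 0.1],
}
image_query = [1.0, 0.0, 0.0]
text_query = [0.0, 1.0, 0.0]

# Shifting the weight changes which query dominates the retrieved result.
image_led = search(index, combine_queries(image_query, text_query, image_weight=0.9), top_k=1)
text_led = search(index, combine_queries(image_query, text_query, image_weight=0.1), top_k=1)
```

In this sketch a single scalar weight trades off the two modalities; the patent itself may weight and combine queries differently.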
-
303.
Publication No.: US20250054116A1
Publication date: 2025-02-13
Application No.: US18929330
Filing date: 2024-10-28
Applicant: Adobe Inc.
Inventor: Haitian Zheng , Zhe Lin , Jingwan Lu , Scott Cohen , Elya Shechtman , Connelly Barnes , Jianming Zhang , Ning Xu , Sohrab Amirghodsi
IPC: G06T5/77 , G06T3/4046 , G06V10/40
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that generate inpainted digital images utilizing a cascaded modulation inpainting neural network. For example, the disclosed systems utilize a cascaded modulation inpainting neural network that includes cascaded modulation decoder layers. In particular, in one or more decoder layers, the disclosed systems start with global code modulation that captures the global-range image structures, followed by an additional modulation that refines the global predictions. Accordingly, in one or more implementations, the image inpainting system provides a mechanism to correct distorted local details. Furthermore, in one or more implementations, the image inpainting system leverages fast Fourier convolution blocks within different resolution layers of the encoder architecture to expand the receptive field of the encoder and to allow the network encoder to better capture global structure.
-
Publication No.: US12136250B2
Publication date: 2024-11-05
Application No.: US17332734
Filing date: 2021-05-27
Applicant: Adobe Inc.
Inventor: Khoi Pham , Kushal Kafle , Zhe Lin , Zhihong Ding , Scott Cohen , Quan Tran
IPC: G06V10/75 , G06F18/214 , G06F18/25 , G06N3/08
Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that extract multiple attributes from an object portrayed in a digital image utilizing a multi-attribute contrastive classification neural network. For example, the disclosed systems utilize a multi-attribute contrastive classification neural network that includes an embedding neural network, a localizer neural network, a multi-attention neural network, and a classifier neural network. In some cases, the disclosed systems train the multi-attribute contrastive classification neural network utilizing a multi-attribute, supervised-contrastive loss. In some embodiments, the disclosed systems generate negative attribute training labels for labeled digital images utilizing positive attribute labels that correspond to the labeled digital images.
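The last sentence of this abstract, deriving negative attribute labels from positive ones, can be pictured with a small sketch. The abstract does not say how the negatives are derived; the version below assumes a hand-written table of mutually exclusive attribute groups, so the groups, attribute names, and function name are all hypothetical.

```python
# Hypothetical groups of mutually exclusive attributes (an assumption for this
# sketch; the abstract does not specify how negatives are chosen).
EXCLUSIVE_GROUPS = {
    "color": {"red", "green", "blue"},
    "material": {"wood", "metal"},
}


def negative_labels(positive):
    """Return attributes ruled out by the positive labels, in sorted order."""
    positive = set(positive)
    negatives = set()
    for members in EXCLUSIVE_GROUPS.values():
        # A positive label in a group implies the group's other members are absent.
        if positive & members:
            negatives |= members - positive
    return sorted(negatives)


labels = negative_labels(["red", "wood"])
```

Groups with no positive label contribute nothing, so an image labeled only "red" yields negatives for the other colors but says nothing about material.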
-
Publication No.: US12136185B2
Publication date: 2024-11-05
Application No.: US17455134
Filing date: 2021-11-16
Applicant: ADOBE INC.
Inventor: Jason Kuen , Jiuxiang Gu , Zhe Lin
IPC: G06T3/4046 , G06N3/045 , G06N3/08 , G06V10/75
Abstract: Systems and methods for image processing are described. The systems and methods include receiving a low-resolution image; generating a feature map based on the low-resolution image using an encoder of a student network, wherein the encoder of the student network is trained based on comparing a predicted feature map from the encoder of the student network and a fused feature map from a teacher network, and wherein the fused feature map represents a combination of a first feature map from a high-resolution encoder of the teacher network and a second feature map from a low-resolution encoder of the teacher network; and decoding the feature map to obtain prediction information for the low-resolution image.
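The training signal described in this abstract, comparing the student's predicted feature map against a fused teacher feature map, can be sketched as follows. This is an illustration under stated assumptions: feature maps are flattened to plain lists, fusion is taken to be a convex combination controlled by a hypothetical `alpha`, and the comparison is mean squared error. The abstract does not specify the fusion operator or the loss.

```python
def fuse(high_res_map, low_res_map, alpha=0.5):
    """Combine two teacher feature maps element-wise (assumed convex combination)."""
    if len(high_res_map) != len(low_res_map):
        raise ValueError("feature maps must have the same shape")
    return [alpha * h + (1.0 - alpha) * l for h, l in zip(high_res_map, low_res_map)]


def feature_loss(predicted, target):
    """Mean squared error between student and fused teacher features (assumed loss)."""
    if len(predicted) != len(target):
        raise ValueError("feature maps must have the same shape")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Made-up flattened feature maps for illustration.
teacher_high = [1.0, 2.0, 3.0, 4.0]   # from the teacher's high-resolution encoder
teacher_low = [0.0, 2.0, 2.0, 6.0]    # from the teacher's low-resolution encoder
student_pred = [0.5, 2.0, 2.5, 4.0]   # from the student's encoder

fused = fuse(teacher_high, teacher_low)
loss = feature_loss(student_pred, fused)
```

Minimizing this loss pushes the student encoder, which only sees the low-resolution image, toward features that also carry information from the teacher's high-resolution branch.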
-
Publication No.: US20240355022A1
Publication date: 2024-10-24
Application No.: US18476504
Filing date: 2023-09-28
Applicant: ADOBE INC.
Inventor: Jing Shi , Wei Xiong , Zhe Lin , Hyun Joon Jung
CPC classification number: G06T11/60 , G06T7/194 , G06T9/00 , G06T2207/20081
Abstract: One or more aspects of a method, apparatus, and non-transitory computer readable medium include obtaining an input description and an input image depicting a subject, encoding the input description using a text encoder of an image generation model to obtain a text embedding, and encoding the input image using a subject encoder of the image generation model to obtain a subject embedding. A guidance embedding is generated by combining the subject embedding and the text embedding, and then an output image is generated based on the guidance embedding using a diffusion model of the image generation model. The output image depicts aspects of the subject and the input description.
-
Publication No.: US20240338869A1
Publication date: 2024-10-10
Application No.: US18474536
Filing date: 2023-09-26
Applicant: ADOBE INC.
Inventor: Yuqian Zhou , Krishna Kumar Singh , Zhifei Zhang , Difan Liu , Zhe Lin , Jianming Zhang , Qing Liu , Jingwan Lu , Elya Shechtman , Sohrab Amirghodsi , Connelly Stuart Barnes
IPC: G06T11/60
CPC classification number: G06T11/60
Abstract: An image processing system obtains an input image (e.g., a user-provided image) and a mask indicating an edit region of the image. A user selects an image editing mode for an image generation network from a plurality of image editing modes. The image generation network generates an output image using the input image, the mask, and the image editing mode.
-
Publication No.: US12079269B2
Publication date: 2024-09-03
Application No.: US18104848
Filing date: 2023-02-02
Applicant: Adobe Inc.
Inventor: Pranav Vineet Aggarwal , Zhe Lin , Baldo Antonio Faieta , Saeid Antonio Motiian
IPC: G06F16/00 , G06F16/538 , G06F16/583 , G06F18/21 , G06F18/22 , G06N3/08 , G06N20/00 , G06V10/82 , G06V10/94 , G06V30/19 , G06V30/262 , G06F40/30 , G10L15/22
CPC classification number: G06F16/583 , G06F16/538 , G06F18/21 , G06F18/22 , G06N3/08 , G06N20/00 , G06V10/82 , G06V10/945 , G06V30/19147 , G06V30/1916 , G06V30/19173 , G06V30/274 , G06F40/30 , G10L15/22
Abstract: Visually guided machine-learning language model and embedding techniques are described that overcome the challenges of conventional techniques in a variety of ways. In one example, a model is trained to support a visually guided machine-learning embedding space that supports visual intuition as to “what” is represented by text. The visually guided language embedding space supported by the model, once trained, may then be used to support visual intuition as part of a variety of functionality. In one such example, the visually guided language embedding space as implemented by the model may be leveraged as part of a multi-modal differential search to support search of digital images and other digital content with real-time focus adaptation which overcomes the challenges of conventional techniques.
-
309.
Publication No.: US12045963B2
Publication date: 2024-07-23
Application No.: US18058630
Filing date: 2022-11-23
Applicant: Adobe Inc.
Inventor: Scott Cohen , Zhe Lin , Zhihong Ding , Luis Figueroa , Kushal Kafle
IPC: G06T5/77 , G06F3/04842 , G06F3/04845 , G06T3/20 , G06V10/70 , G06V10/86
CPC classification number: G06T5/77 , G06F3/04842 , G06F3/04845 , G06T3/20 , G06V10/768 , G06V10/86 , G06T2200/24 , G06T2207/20084 , G06T2207/20104
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify digital images via scene-based editing using image understanding facilitated by artificial intelligence. For instance, in one or more embodiments, the disclosed systems detect, via a graphical user interface of a client device, a user selection of an object portrayed within a digital image. The disclosed systems determine, in response to detecting the user selection of the object, a relationship between the object and an additional object portrayed within the digital image. The disclosed systems receive one or more user interactions for modifying the object. The disclosed systems modify the digital image in response to the one or more user interactions by modifying the object and the additional object based on the relationship between the object and the additional object.
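The relationship-aware edit in this abstract, where modifying a selected object also modifies a related object, can be sketched with a toy scene representation. Everything here is an assumption made for illustration: the `Scene` and `SceneObject` classes, the relation names, and the rule that only certain relations cause an object to move along are hypothetical, and the abstract does not state how relationships are determined or stored.

```python
from dataclasses import dataclass, field

# Hypothetical relations under which an object follows the selected object.
MOVES_WITH = {"worn_by", "held_by"}


@dataclass
class SceneObject:
    name: str
    x: int
    y: int


@dataclass
class Scene:
    objects: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)  # (subject, relation, target)

    def add(self, obj):
        self.objects[obj.name] = obj

    def relate(self, subject, relation, target):
        self.relationships.append((subject, relation, target))

    def attached_to(self, name):
        """Objects whose relationship to `name` means they should move with it."""
        return [s for s, rel, t in self.relationships if t == name and rel in MOVES_WITH]

    def move(self, name, dx, dy):
        """Move the selected object and every object attached to it."""
        moved = [name] + self.attached_to(name)
        for obj_name in moved:
            obj = self.objects[obj_name]
            obj.x += dx
            obj.y += dy
        return moved


scene = Scene()
scene.add(SceneObject("person", 10, 20))
scene.add(SceneObject("hat", 10, 5))
scene.add(SceneObject("tree", 40, 20))
scene.relate("hat", "worn_by", "person")
scene.relate("tree", "behind", "person")

moved = scene.move("person", 15, 0)
```

Here the hat follows the person because of the "worn_by" relation, while the tree stays in place because "behind" is not treated as an attachment.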
-
Publication No.: US12008739B2
Publication date: 2024-06-11
Application No.: US17452529
Filing date: 2021-10-27
Applicant: ADOBE INC.
Inventor: Ning Xu , Zhe Lin , Franck Dernoncourt
CPC classification number: G06T5/77 , G06N3/08 , G06T5/50 , G06T5/90 , G06T11/60 , G10L15/22 , G06T2207/20081 , G10L2015/223
Abstract: The present disclosure relates to systems and methods for automatically processing images based on a user request. In some examples, a request is divided into a retouching command (e.g., a global edit) and an inpainting command (e.g., a local edit). A retouching mask and an inpainting mask are generated to indicate areas where the edits will be applied. A photo-request attention and a multi-modal modulation process are applied to features representing the image, and a modified image that incorporates the user's request is generated using the modified features.