-
Publication No.: US20230360180A1
Publication date: 2023-11-09
Application No.: US17661985
Application date: 2022-05-04
Applicant: Adobe Inc.
Inventor: Haitian Zheng , Zhe Lin , Jingwan Lu , Scott Cohen , Elya Shechtman , Connelly Barnes , Jianming Zhang , Ning Xu , Sohrab Amirghodsi
CPC classification number: G06T5/005 , G06T3/4046 , G06V10/40 , G06T2207/20084
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that generate inpainted digital images utilizing a cascaded modulation inpainting neural network. For example, the disclosed systems utilize a cascaded modulation inpainting neural network that includes cascaded modulation decoder layers. For example, in one or more decoder layers, the disclosed systems start with global code modulation that captures the global-range image structures followed by an additional modulation that refines the global predictions. Accordingly, in one or more implementations, the image inpainting system provides a mechanism to correct distorted local details. Furthermore, in one or more implementations, the image inpainting system leverages fast Fourier convolution blocks within different resolution layers of the encoder architecture to expand the receptive field of the encoder and to allow the network encoder to better capture global structure.
-
82.
Publication No.: US20230325996A1
Publication date: 2023-10-12
Application No.: US18167690
Application date: 2023-02-10
Applicant: Adobe Inc.
Inventor: Zhifei Zhang , Jianming Zhang , Scott Cohen , Zhe Lin
IPC: G06T5/50 , G06T3/40 , G06V10/60 , G06F3/04842
CPC classification number: G06T5/50 , G06T3/40 , G06V10/60 , G06F3/04842 , G06T2207/20101 , G06T2207/20104 , G06T2207/20221
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that generate composite images via auto-compositing features. For example, in one or more embodiments, the disclosed systems determine a background image and a foreground object image for use in generating a composite image. The disclosed systems further provide, for display within a graphical user interface of a client device, at least one selectable option for executing an auto-composite model for the composite image, the auto-composite model comprising at least one of a scale prediction model, a harmonization model, or a shadow generation model. The disclosed systems detect, via the graphical user interface, a user selection of the at least one selectable option and generate, in response to detecting the user selection, the composite image by executing the auto-composite model using the background image and the foreground object image.
-
83.
Publication No.: US20230325991A1
Publication date: 2023-10-12
Application No.: US17658770
Application date: 2022-04-11
Applicant: Adobe Inc.
Inventor: Zhe Lin , Sijie Zhu , Jason Wen Yong Kuen , Scott Cohen , Zhifei Zhang
CPC classification number: G06T5/50 , G06T7/194 , G06T5/002 , G06T3/60 , G06T2207/20084 , G06T2207/20221
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize artificial intelligence to learn to recommend foreground object images for use in generating composite images based on geometry and/or lighting features. For instance, in one or more embodiments, the disclosed systems transform a foreground object image corresponding to a background image using at least one of a geometry transformation or a lighting transformation. The disclosed systems further generate predicted embeddings for the background image, the foreground object image, and the transformed foreground object image within a geometry-lighting-sensitive embedding space utilizing a geometry-lighting-aware neural network. Using a loss determined from the predicted embeddings, the disclosed systems update parameters of the geometry-lighting-aware neural network. The disclosed systems further provide a variety of efficient user interfaces for generating composite digital images.
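The abstract does not say which loss is computed from the three predicted embeddings. One plausible reading, offered here only as an assumption, is a triplet-style margin loss in which the background embedding is the anchor, the original foreground is the positive, and the geometry- or lighting-transformed foreground is the negative. The function names, the cosine similarity measure, and the margin value below are hypothetical and are not taken from the patent.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_margin_loss(background, foreground, transformed, margin=0.2):
    """Assumed triplet-style loss over three embeddings.

    The loss is zero once the background is more similar to the original
    foreground than to the transformed foreground by at least `margin`.
    """
    positive = cosine_similarity(background, foreground)
    negative = cosine_similarity(background, transformed)
    return max(0.0, margin - positive + negative)

# Toy 2-D embeddings: the original foreground matches the background,
# while the transformed foreground is orthogonal to it.
loss = triplet_margin_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
```

In a training loop, a loss of this shape would be averaged over a batch and back-propagated through the embedding network; that step is omitted here because the abstract gives no detail about it.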
-
Publication No.: US20230298148A1
Publication date: 2023-09-21
Application No.: US17655663
Application date: 2022-03-21
Applicant: Adobe Inc.
Inventor: He Zhang , Jianming Zhang , Jose Ignacio Echevarria Vallespi , Kalyan Sunkavalli , Meredith Payne Stotzner , Yinglan Ma , Zhe Lin , Elya Shechtman , Frederick Mandia
CPC classification number: G06T5/50 , G06T7/194 , G06T7/90 , G06T11/001 , G06T2207/20084 , G06T2207/20212 , G06T2200/24 , G06T2207/20092 , G06T2207/20016 , G06T2207/20081 , G06T2207/30168
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a dual-branched neural network architecture to harmonize composite images. For example, in one or more implementations, the transformer-based harmonization system uses a convolutional branch and a transformer branch to generate a harmonized composite image based on an input composite image and a corresponding segmentation mask. More particularly, the convolutional branch comprises a series of convolutional neural network layers followed by a style normalization layer to extract localized information from the input composite image. Further, the transformer branch comprises a series of transformer neural network layers to extract global information based on different resolutions of the input composite image. Utilizing a decoder, the transformer-based harmonization system combines the local information and the global information from the corresponding convolutional branch and transformer branch to generate a harmonized composite image.
-
Publication No.: US11758082B2
Publication date: 2023-09-12
Application No.: US17526853
Application date: 2021-11-15
Applicant: Adobe Inc.
Inventor: Lu Zhang , Jianming Zhang , Zhe Lin , Radomir Mech
IPC: H04N5/262 , G11B27/031 , G06V20/40 , G06V10/20 , G06V40/18
CPC classification number: H04N5/2628 , G06V10/255 , G06V20/40 , G06V20/41 , G11B27/031 , G06V40/193
Abstract: Systems and methods provide reframing operations in a smart editing system that may generate a focal point within a mask of an object for each frame of a video segment and perform editing effects on the frames of the video segment to quickly provide users with natural video editing effects. A reframing engine may process video clips using a segmentation and hotspot module to determine a salient region of an object, generate a mask of the object, and track the trajectory of an object in the video clips. The reframing engine may then receive reframing parameters from a crop suggestion module and a user interface. Based on the determined trajectory of an object in a video clip and reframing parameters, the reframing engine may use reframing logic to produce temporally consistent reframing effects relative to an object for the video clip.
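To make the idea of temporally consistent reframing concrete, the sketch below smooths a per-frame focal-point trajectory and derives a crop window for each frame. The abstract does not describe the reframing logic itself, so the exponential moving average, the clamping rule, and every name and parameter here are assumptions chosen for illustration. The crop is assumed to be no larger than the frame.

```python
def smooth_trajectory(points, alpha=0.3):
    """Exponentially smooth a list of (x, y) focal points, one per frame."""
    smoothed = []
    prev = None
    for x, y in points:
        if prev is None:
            prev = (float(x), float(y))
        else:
            # Blend the new focal point with the previous smoothed position.
            prev = (alpha * x + (1 - alpha) * prev[0],
                    alpha * y + (1 - alpha) * prev[1])
        smoothed.append(prev)
    return smoothed

def crop_window(center, crop_size, frame_size):
    """Return (left, top, width, height) centered on `center`, kept inside the frame."""
    cx, cy = center
    cw, ch = crop_size
    fw, fh = frame_size
    left = min(max(cx - cw / 2, 0), fw - cw)
    top = min(max(cy - ch / 2, 0), fh - ch)
    return (left, top, cw, ch)

def reframe(points, crop_size, frame_size, alpha=0.3):
    """Produce one crop window per frame from the smoothed trajectory."""
    return [crop_window(c, crop_size, frame_size)
            for c in smooth_trajectory(points, alpha)]

# Two frames of a hypothetical focal-point track in a 1920x1080 video.
windows = reframe([(100, 100), (200, 100)], (100, 100), (1920, 1080), alpha=0.5)
```

A smaller `alpha` gives a steadier but slower-following crop; a value of 1.0 disables smoothing and tracks the raw focal points.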
-
86.
Publication No.: US20230245266A1
Publication date: 2023-08-03
Application No.: US18298630
Application date: 2023-04-11
Applicant: Adobe Inc.
Inventor: Haitian Zheng , Zhe Lin , Jingwan Lu , Scott Cohen , Jianming Zhang , Ning Xu
CPC classification number: G06T3/0093 , G06T9/002 , G06T11/00 , G06V10/46 , G06V30/2504 , G06F18/213 , G06T2210/36
Abstract: This disclosure describes one or more implementations of a digital image semantic layout manipulation system that generates refined digital images resembling the style of one or more input images while following the structure of an edited semantic layout. For example, in various implementations, the digital image semantic layout manipulation system builds and utilizes a sparse attention warped image neural network to generate high-resolution warped images and a digital image layout neural network to enhance and refine the high-resolution warped digital image into a realistic and accurate refined digital image.
-
Publication No.: US11636570B2
Publication date: 2023-04-25
Application No.: US17220543
Application date: 2021-04-01
Applicant: Adobe Inc.
Inventor: Haitian Zheng , Zhe Lin , Jingwan Lu , Scott Cohen , Jianming Zhang , Ning Xu
Abstract: This disclosure describes one or more implementations of a digital image semantic layout manipulation system that generates refined digital images resembling the style of one or more input images while following the structure of an edited semantic layout. For example, in various implementations, the digital image semantic layout manipulation system builds and utilizes a sparse attention warped image neural network to generate high-resolution warped images and a digital image layout neural network to enhance and refine the high-resolution warped digital image into a realistic and accurate refined digital image.
-
Publication No.: US11636270B2
Publication date: 2023-04-25
Application No.: US16775697
Application date: 2020-01-29
Applicant: ADOBE INC.
Inventor: Zhe Lin , Walter W. Chang , Scott Cohen , Khoi Viet Pham , Jonathan Brandt , Franck Dernoncourt
IPC: G06F40/30 , G06F16/532 , G06F16/55 , G06F40/205 , G06F40/295 , G06N5/02 , G06N5/04 , G06N20/00 , G06V10/40 , G06V30/10
Abstract: Embodiments of the present invention provide systems, methods, and non-transitory computer storage media for parsing a given input referring expression into a parse structure and generating a semantic computation graph to identify semantic relationships among and between objects. At a high level, when embodiments of the present invention receive a referring expression, a parse tree is created and mapped into a hierarchical subject, predicate, object graph structure that labels noun objects in the referring expression, the attributes of the labeled noun objects, and predicate relationships (e.g., verb actions or spatial propositions) between the labeled objects. Embodiments of the present invention then transform the subject, predicate, object graph structure into a semantic computation graph that may be recursively traversed and interpreted to determine how noun objects, their attributes and modifiers, and interrelationships are provided to downstream image editing, searching, or caption indexing tasks.
-
Publication No.: US11615567B2
Publication date: 2023-03-28
Application No.: US16952008
Application date: 2020-11-18
Applicant: Adobe Inc.
Inventor: Midhun Harikumar , Pranav Aggarwal , Baldo Faieta , Ajinkya Kale , Zhe Lin
Abstract: A non-transitory computer-readable medium includes program code that is stored thereon. The program code is executable by one or more processing devices for performing operations including generating, by a model that includes trainable components, a learned image representation of a target image. The operations further include generating, by a text embedding model, a text embedding of a text query. The text embedding and the learned image representation of the target image are in a same embedding space. Additionally, the operations include generating a class activation map of the target image by, at least, convolving the learned image representation of the target image with the text embedding of the text query. Moreover, the operations include generating an object-segmented image using the class activation map of the target image.
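The convolution step in this abstract can be pictured as using the text embedding as a 1x1 kernel over a spatial image feature map, which amounts to a per-pixel dot product. The sketch below is a minimal NumPy illustration under that reading. The array shapes, the function names, the min-max normalization, and the 0.5 threshold are assumptions and are not taken from the patent.

```python
import numpy as np

def class_activation_map(image_features, text_embedding):
    """Convolve an (H, W, D) feature map with a (D,) text embedding.

    A 1x1 convolution with a single D-channel kernel is equivalent to a
    dot product between the kernel and the feature vector at each pixel.
    """
    cam = np.tensordot(image_features, text_embedding, axes=([2], [0]))
    # Rescale to [0, 1] so a fixed threshold can be applied (assumption).
    lo, hi = cam.min(), cam.max()
    if hi - lo < 1e-12:
        return np.zeros_like(cam)
    return (cam - lo) / (hi - lo)

def segment(cam, threshold=0.5):
    """Binary object mask from the activation map (hypothetical thresholding rule)."""
    return cam >= threshold

# Toy 2x2 feature map with 3 channels; only the top-left pixel matches the query.
features = np.zeros((2, 2, 3))
features[0, 0] = [1.0, 0.0, 0.0]
query = np.array([1.0, 0.0, 0.0])
mask = segment(class_activation_map(features, query))
```

The dot product is only meaningful because, as the abstract states, the text embedding and the learned image representation share one embedding space.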
-
Publication No.: US11605168B2
Publication date: 2023-03-14
Application No.: US17215067
Application date: 2021-03-29
Applicant: Adobe Inc.
Inventor: Mingyang Ling , Alex Filipkowski , Zhe Lin , Jianming Zhang , Samarth Gulati
IPC: G06K9/62 , G06T7/11 , G06T7/136 , G06T7/143 , G06T7/174 , G06F18/214 , G06N3/045 , G06V10/25 , G06V10/764 , G06V10/82 , G06V10/26
Abstract: Techniques are disclosed for characterizing and defining the location of a copy space in an image. A methodology implementing the techniques according to an embodiment includes applying a regression convolutional neural network (CNN) to an image. The regression CNN is configured to predict properties of the copy space such as size and type (natural or manufactured). The prediction is conditioned on a determination of the presence of the copy space in the image. The method further includes applying a segmentation CNN to the image. The segmentation CNN is configured to generate one or more pixel-level masks to define the location of copy spaces in the image, whether natural or manufactured, or to define the location of a background region of the image. The segmentation CNN may include a first stage comprising convolutional layers and a second stage comprising pairs of boundary refinement layers and bilinear up-sampling layers.
-