-
Publication No.: US20240320873A1
Publication Date: 2024-09-26
Application No.: US18439036
Filing Date: 2024-02-12
Applicant: ADOBE INC.
Inventor: Tobias Hinz, Ali Aminian, Hao Tan, Kushal Kafle, Oliver Wang, Jingwan Lu
IPC: G06T11/00
CPC classification number: G06T11/00, G06T2211/441
Abstract: A method, apparatus, non-transitory computer readable medium, and system for image generation include obtaining a text prompt and encoding, using a text encoder jointly trained with an image generation model, the text prompt to obtain a text embedding. Some embodiments generate, using the image generation model, a synthetic image based on the text embedding.
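The two-stage flow in this abstract (encode the prompt, then condition an image generator on the embedding) can be sketched as follows. Everything here is an illustrative placeholder, not the disclosed implementation: the weights are random, and the real system would train the text encoder and the image generation model jointly.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMBED_DIM, IMG_SIZE = 1000, 16, 8

# Placeholder weights: in the described system the text encoder and the
# image generation model are trained jointly; random values stand in here.
token_table = rng.normal(size=(VOCAB, EMBED_DIM))
generator_w = rng.normal(size=(EMBED_DIM, IMG_SIZE * IMG_SIZE))

def encode_text(prompt):
    """Encode the text prompt into a single text embedding."""
    token_ids = [hash(word) % VOCAB for word in prompt.lower().split()]
    return token_table[token_ids].mean(axis=0)            # (EMBED_DIM,)

def generate_image(text_embedding):
    """Generate a synthetic grayscale image conditioned on the embedding."""
    flat = np.tanh(text_embedding @ generator_w)          # values in (-1, 1)
    return flat.reshape(IMG_SIZE, IMG_SIZE)

image = generate_image(encode_text("a lighthouse at dusk"))
```

The key structural point the claim makes is that the text embedding is the sole conditioning signal passed from the first stage to the second.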
-
Publication No.: US11854206B2
Publication Date: 2023-12-26
Application No.: US17735156
Filing Date: 2022-05-03
Applicant: Adobe Inc.
Inventor: Federico Perazzi, Zhe Lin, Ping Hu, Oliver Wang, Fabian David Caba Heilbron
CPC classification number: G06T7/11, G06F17/15, G06N3/045, G06V10/806, G06V20/46, G06V20/49, G06T2207/10016, G06T2207/20084
Abstract: A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally related video frames. The VSSS extracts features from the video frames in the contiguous sequence and, based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks is used to extract the features to be used for video segmentation, and the extraction of features is distributed among the multiple neural networks in the set. A strong feature representation covering the entirety of the features is produced for each video frame in the sequence by aggregating the output features extracted by the multiple neural networks.
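The distribution scheme the abstract describes (run only one of the sub-networks per frame, then aggregate the most recent outputs into one strong representation) can be sketched with toy numbers. All weights, dimensions, and the round-robin schedule are hypothetical stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
N_SUBNETS, FEAT_DIM, N_LABELS, H, W = 4, 6, 3, 4, 4

# One small random weight per sub-network: hypothetical stand-ins for the
# temporally distributed neural networks, each extracting only part of the
# full feature set for a frame.
subnet_w = [rng.normal(size=(1, 1, FEAT_DIM)) for _ in range(N_SUBNETS)]
label_w = rng.normal(size=(N_SUBNETS * FEAT_DIM, N_LABELS))

def extract(frame, t):
    """Run a single sub-network per frame, round-robin over time."""
    return frame[..., None] * subnet_w[t % N_SUBNETS]     # (H, W, FEAT_DIM)

def segment_video(frames):
    recent, labels = [], []
    for t, frame in enumerate(frames):
        recent = (recent + [extract(frame, t)])[-N_SUBNETS:]
        pad = [np.zeros((H, W, FEAT_DIM))] * (N_SUBNETS - len(recent))
        strong = np.concatenate(recent + pad, axis=-1)    # aggregated features
        labels.append((strong @ label_w).argmax(axis=-1)) # per-pixel label
    return labels

labels = segment_video(rng.normal(size=(6, H, W)))
```

The payoff of this layout is that per-frame cost is one sub-network's worth of computation, while the aggregated representation still spans all of them.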
-
Publication No.: US20230102055A1
Publication Date: 2023-03-30
Application No.: US18058163
Filing Date: 2022-11-22
Applicant: Adobe Inc.
Inventor: Taesung Park, Richard Zhang, Oliver Wang, Junyan Zhu, Jingwan Lu, Elya Shechtman, Alexei A. Efros
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating a modified digital image from extracted spatial and global codes. For example, the disclosed systems can utilize a global and spatial autoencoder to extract spatial codes and global codes from digital images. The disclosed systems can further utilize the global and spatial autoencoder to generate a modified digital image by combining extracted spatial and global codes in various ways for various applications such as style swapping, style blending, and attribute editing.
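The split-and-recombine idea (a spatial code per image, a global code per image, and editing by mixing them) can be sketched as a toy linear autoencoder. The encode/decode rules below are illustrative assumptions, not the disclosed network:

```python
import numpy as np

rng = np.random.default_rng(2)
H, W = 4, 4

# Hypothetical linear "global and spatial autoencoder": the encoder splits an
# image into a spatial code (a per-pixel map) and a global code (one vector).
w_global = rng.normal(size=(H * W, 3))

def encode(img):
    spatial_code = img - img.mean()              # layout with style removed
    global_code = img.reshape(-1) @ w_global     # compact style summary
    return spatial_code, global_code

def decode(spatial_code, global_code):
    # Recombine: the spatial code carries layout, the global code sets style.
    return spatial_code * (1.0 + 0.1 * global_code[0]) + 0.1 * global_code[1]

def style_swap(img_a, img_b):
    """Generate an image with the layout of A and the style of B."""
    spatial_a, _ = encode(img_a)
    _, global_b = encode(img_b)
    return decode(spatial_a, global_b)

def style_blend(img_a, img_b, alpha=0.5):
    """Interpolate the two global codes for a blended style."""
    spatial_a, global_a = encode(img_a)
    _, global_b = encode(img_b)
    return decode(spatial_a, (1 - alpha) * global_a + alpha * global_b)

swapped = style_swap(rng.normal(size=(H, W)), rng.normal(size=(H, W)))
```

Style swapping and style blending differ only in where the global code comes from; attribute editing would instead perturb individual components of the global code.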
-
Publication No.: US11539932B2
Publication Date: 2022-12-27
Application No.: US17519332
Filing Date: 2021-11-04
Applicant: Adobe Inc.
Inventor: Stephen DiVerdi, Seth Walker, Oliver Wang, Cuong Nguyen
IPC: H04N13/111, H04N13/282, H04N13/383
Abstract: This disclosure relates to methods, non-transitory computer readable media, and systems that generate and dynamically change filter parameters for a frame of a 360-degree video based on detecting a field of view from a computing device. As a computing device rotates or otherwise changes orientation, for instance, the disclosed systems can detect a field of view and interpolate one or more filter parameters corresponding to nearby spatial keyframes of the 360-degree video to generate view-specific-filter parameters. By generating and storing filter parameters for spatial keyframes corresponding to different times and different view directions, the disclosed systems can dynamically adjust color grading or other visual effects using interpolated, view-specific-filter parameters to render a filtered version of the 360-degree video.
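The interpolation step is the heart of this abstract: as the view direction changes, filter parameters are blended from nearby spatial keyframes. A minimal one-axis sketch, with the keyframe values chosen purely for illustration:

```python
# Hypothetical spatial keyframes for one frame time of a 360-degree video:
# (view yaw in degrees, saturation gain) pairs set by a colorist.
keyframes = [(0.0, 1.0), (90.0, 2.0)]

def view_filter_params(yaw):
    """Interpolate view-specific filter parameters from nearby keyframes."""
    (y0, p0), (y1, p1) = keyframes
    yaw = min(max(yaw, y0), y1)           # clamp to the keyframed span
    t = (yaw - y0) / (y1 - y0)
    return p0 + t * (p1 - p0)

# As the viewer turns from yaw 0 toward yaw 90, saturation ramps 1.0 -> 2.0.
print(view_filter_params(45.0))           # -> 1.5
```

A full implementation would interpolate over both time and view direction (and over several filter parameters at once), but the per-parameter blend is the same idea.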
-
Publication No.: US11443481B1
Publication Date: 2022-09-13
Application No.: US17186522
Filing Date: 2021-02-26
Applicant: Adobe Inc.
Inventor: Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Mai Long, Su Chen
Abstract: This disclosure describes implementations of a three-dimensional (3D) scene recovery system that reconstructs a 3D scene representation of a scene portrayed in a single digital image. For instance, the 3D scene recovery system trains and utilizes a 3D point cloud model to recover accurate intrinsic camera parameters from a depth map of the digital image. Additionally, the 3D point cloud model may include multiple neural networks that target specific intrinsic camera parameters. For example, the 3D point cloud model may include a depth 3D point cloud neural network that recovers the depth shift, as well as a focal length 3D point cloud neural network that recovers the camera focal length. Further, the 3D scene recovery system may utilize the recovered intrinsic camera parameters to transform the single digital image into an accurate and realistic 3D scene representation, such as a 3D point cloud.
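The final step of the abstract, turning a depth map plus recovered intrinsics into a point cloud, is standard pinhole unprojection and can be shown concretely. Here the focal length and depth shift are simply passed in; in the described system they would come from the two recovery networks:

```python
import numpy as np

def unproject(depth_map, focal_length, depth_shift):
    """Lift a depth map into a 3-D point cloud via the pinhole camera model."""
    h, w = depth_map.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0        # principal point at center
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_map + depth_shift                  # apply the recovered depth shift
    x = (u - cx) * z / focal_length              # back-project each pixel
    y = (v - cy) * z / focal_length
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

cloud = unproject(np.ones((4, 4)), focal_length=2.0, depth_shift=0.5)
```

Because x and y scale with z / focal_length, errors in either recovered parameter distort the cloud, which is why the abstract stresses recovering them accurately.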
-
Publication No.: US20220270370A1
Publication Date: 2022-08-25
Application No.: US17735156
Filing Date: 2022-05-03
Applicant: Adobe Inc.
Inventor: Federico Perazzi, Zhe Lin, Ping Hu, Oliver Wang, Fabian David Caba Heilbron
Abstract: A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally related video frames. The VSSS extracts features from the video frames in the contiguous sequence and, based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks is used to extract the features to be used for video segmentation, and the extraction of features is distributed among the multiple neural networks in the set. A strong feature representation covering the entirety of the features is produced for each video frame in the sequence by aggregating the output features extracted by the multiple neural networks.
-
Publication No.: US11367206B2
Publication Date: 2022-06-21
Application No.: US16790056
Filing Date: 2020-02-13
Applicant: Adobe Inc.
Inventor: Zhe Lin, Oliver Wang, Mai Long, Ke Xian, Jianming Zhang
Abstract: In order to provide monocular depth prediction, a trained neural network may be used. To train the neural network, edge detection on a digital image may be performed to determine at least one edge of the digital image, and then a first point and a second point of the digital image may be sampled, based on the at least one edge. A relative depth between the first point and the second point may be predicted, and the neural network may be trained to perform monocular depth prediction using a loss function that compares the predicted relative depth with a ground truth relative depth between the first point and the second point.
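The training signal the abstract describes, sampling point pairs near edges and comparing predicted relative depth with ground truth, can be sketched with a common pairwise ranking formulation. The exact loss used in the described training may differ; the sampling rule and hinge below are illustrative:

```python
import numpy as np

def sample_edge_pairs(image):
    """Sample point pairs that straddle strong horizontal gradients (edges)."""
    grad = np.abs(np.diff(image, axis=1))
    ys, xs = np.nonzero(grad > grad.mean())
    return [((y, x), (y, x + 1)) for y, x in zip(ys, xs)]

def ranking_loss(pred_depth, gt_depth, pairs, margin=0.0):
    """Pairwise ranking loss on relative depth: penalize predictions whose
    ordering of the two points disagrees with the ground truth."""
    total = 0.0
    for p1, p2 in pairs:
        sign = np.sign(gt_depth[p1] - gt_depth[p2])   # ground-truth ordering
        diff = pred_depth[p1] - pred_depth[p2]
        if sign == 0:
            total += diff ** 2                        # equal depth: pull together
        else:
            total += max(0.0, margin - sign * diff)   # hinge on the ordering
    return total / len(pairs)

gt = np.array([[1.0, 1.0, 5.0, 5.0],
               [1.0, 1.0, 5.0, 5.0]])
pairs = sample_edge_pairs(gt)
```

Sampling near edges concentrates supervision where depth ordering is most informative (occlusion boundaries), rather than on flat regions where most pairs are trivially equal.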
-
Publication No.: US20220122305A1
Publication Date: 2022-04-21
Application No.: US17384273
Filing Date: 2021-07-23
Applicant: Adobe Inc.
Inventor: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffrey, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
Abstract: An improved system architecture uses a pipeline including an encoder and a Generative Adversarial Network (GAN) including a generator neural network to generate edited images with improved speed, realism, and identity preservation. The encoder produces an initial latent space representation of an input image by encoding the input image. The generator neural network generates an initial output image by processing the initial latent space representation of the input image. The system generates an optimized latent space representation of the input image using a loss minimization technique that minimizes a loss between the input image and the initial output image. The loss is based on target perceptual features extracted from the input image and initial perceptual features extracted from the initial output image. The system outputs the optimized latent space representation of the input image for downstream use.
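The optimization stage of this pipeline, refining a latent code by minimizing a loss between the input image and the generator's output, can be sketched with a toy linear generator. A plain L2 image loss stands in for the perceptual-feature loss the abstract describes, and all dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
LATENT_DIM, IMG_DIM = 4, 8

G = rng.normal(size=(LATENT_DIM, IMG_DIM))      # toy linear "generator"

def generate(w):
    return w @ G

def invert(target, steps=2000, lr=0.02):
    """Optimize a latent code so the generated image matches the target,
    by ordinary gradient descent on the reconstruction loss."""
    w = np.zeros(LATENT_DIM)                    # initial latent (the encoder's
                                                # output in the described pipeline)
    for _ in range(steps):
        residual = generate(w) - target
        w -= lr * (2.0 * residual @ G.T)        # gradient of ||G(w) - x||^2
    return w

target = generate(rng.normal(size=LATENT_DIM))  # a target the generator can hit
w_opt = invert(target)
```

Starting the descent from an encoder's prediction rather than from zero is what buys the speed the abstract claims: far fewer optimization steps are needed to reach a faithful latent.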
-
Publication No.: US20220122222A1
Publication Date: 2022-04-21
Application No.: US17384283
Filing Date: 2021-07-23
Applicant: Adobe Inc.
Inventor: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffrey, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
Abstract: An improved system architecture uses a Generative Adversarial Network (GAN) including a specialized generator neural network to generate multiple resolution output images. The system produces a latent space representation of an input image. The system generates a first output image at a first resolution by providing the latent space representation of the input image as input to a generator neural network comprising an input layer, an output layer, and a plurality of intermediate layers and taking the first output image from an intermediate layer, of the plurality of intermediate layers of the generator neural network. The system generates a second output image at a second resolution different from the first resolution by providing the latent space representation of the input image as input to the generator neural network and taking the second output image from the output layer of the generator neural network.
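The structural trick here, taking an image from an intermediate layer for low resolution or from the output layer for full resolution, given the same latent, can be sketched with a toy progressive generator. The layer operations are placeholder arithmetic, not the disclosed architecture:

```python
import numpy as np

rng = np.random.default_rng(4)
LATENT_DIM = 8

def generator(latent, take_layer=3):
    """Toy progressive generator: every layer doubles the resolution, and an
    output image can be taken from an intermediate layer, not only the last."""
    img = latent.reshape(2, 2, 2).mean(axis=-1)        # 2x2 base image
    for layer in range(1, 4):                          # layers yield 4x4, 8x8, 16x16
        img = img.repeat(2, axis=0).repeat(2, axis=1)  # nearest-neighbour upsample
        img = np.tanh(img + 0.1 * layer)               # stand-in for conv layers
        if layer == take_layer:
            break
    return img

w = rng.normal(size=LATENT_DIM)           # one latent space representation
low_res = generator(w, take_layer=1)      # 4x4 image from an intermediate layer
high_res = generator(w, take_layer=3)     # 16x16 image from the output layer
```

Because both images come from the same latent and the same forward pass prefix, the low-resolution result is available early and cheaply, e.g. for interactive previews.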
-
Publication No.: US20210287007A1
Publication Date: 2021-09-16
Application No.: US16817100
Filing Date: 2020-03-12
Applicant: Adobe Inc.
Inventor: Oliver Wang, Matthew Fisher, John Nelson, Geoffrey Oxholm, Elya Shechtman, Wenqi Xian
Abstract: Certain aspects involve video inpainting in which content is propagated from a user-provided reference frame to other video frames depicting a scene. For example, a computing system accesses a set of video frames with annotations identifying a target region to be modified. The computing system determines a motion of the target region's boundary across the set of video frames, and also interpolates pixel motion within the target region across the set of video frames. The computing system also inserts, responsive to user input, a reference frame into the set of video frames. The reference frame can include reference color data from a user-specified modification to the target region. The computing system can use the reference color data and the interpolated motion to update color data in the target region across the set of video frames.
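The propagation step, carrying the reference frame's edited colors into the other frames along the interpolated motion, can be reduced to a minimal sketch. The constant one-pixel-per-frame shift stands in for the interpolated per-pixel motion field the abstract describes:

```python
import numpy as np

H, W = 6, 8

# User-provided reference frame: the target region (a 2x2 patch) has been
# repainted with new color data (value 1.0 on a zero background).
reference = np.zeros((H, W))
reference[2:4, 2:4] = 1.0

def propagate(reference, n_frames, dx_per_frame=1):
    """Copy the reference frame's edited colors into every later frame,
    displaced by the motion of the target region (here a constant
    one-pixel-per-frame horizontal shift, for illustration)."""
    return [np.roll(reference, dx_per_frame * t, axis=1)
            for t in range(n_frames)]

frames = propagate(reference, n_frames=3)
```

In the full system the displacement would differ per pixel and per frame, derived from the boundary motion and the interpolated interior motion, but each frame's fill is still a motion-compensated copy of the reference color data.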