-
Publication No.: US11880766B2
Publication Date: 2024-01-23
Application No.: US17384357
Filing Date: 2021-07-23
Applicant: Adobe Inc.
Inventors: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffrey, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
IPC Classification: G06N3/08, G06F3/04845, G06F3/04847, G06T11/60, G06T3/40, G06N20/20, G06T5/00, G06T5/20, G06T3/00, G06T11/00, G06F18/40, G06F18/211, G06F18/214, G06F18/21, G06N3/045
CPC Classification: G06N3/08, G06F3/04845, G06F3/04847, G06F18/211, G06F18/214, G06F18/2163, G06F18/40, G06N3/045, G06N20/20, G06T3/0006, G06T3/0093, G06T3/40, G06T3/4038, G06T3/4046, G06T5/005, G06T5/20, G06T11/001, G06T11/60, G06T2207/10024, G06T2207/20081, G06T2207/20084, G06T2207/20221, G06T2210/22
Abstract: An improved system architecture uses a pipeline including a Generative Adversarial Network (GAN) including a generator neural network and a discriminator neural network to generate an image. An input image in a first domain and information about a target domain are obtained. The domains correspond to image styles. An initial latent space representation of the input image is produced by encoding the input image. An initial output image is generated by processing the initial latent space representation with the generator neural network. Using the discriminator neural network, a score is computed indicating whether the initial output image is in the target domain. A loss is computed based on the computed score. The loss is minimized to compute an updated latent space representation. The updated latent space representation is processed with the generator neural network to generate an output image in the target domain.
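The latent-refinement loop described in this abstract can be sketched as follows. This is a toy illustration only, not the patented implementation: the scalar `generator` and `discriminator` functions, their coefficients, the negative-log loss, and the finite-difference gradient descent are all assumptions chosen to keep the example self-contained.

```python
import math

def generator(z, weight=2.0, bias=0.5):
    """Hypothetical stand-in for the generator: scalar latent -> scalar 'image'."""
    return weight * z + bias

def discriminator(x, w=1.5, c=-3.0):
    """Hypothetical stand-in for the discriminator: score in (0, 1) that x is in the target domain."""
    return 1.0 / (1.0 + math.exp(-(w * x + c)))

def domain_loss(z):
    """Loss derived from the discriminator score (assumed form: negative log score)."""
    return -math.log(discriminator(generator(z)))

def refine_latent(z_init, steps=200, lr=0.1, eps=1e-5):
    """Minimize the loss over the latent code with gradient descent.

    The gradient is estimated by central differences to avoid any
    dependency on an autodiff library.
    """
    z = z_init
    for _ in range(steps):
        grad = (domain_loss(z + eps) - domain_loss(z - eps)) / (2 * eps)
        z -= lr * grad
    return z

z0 = 0.0                 # pretend this came from encoding the input image
z1 = refine_latent(z0)   # updated latent space representation
initial_score = discriminator(generator(z0))
final_score = discriminator(generator(z1))
print(f"score before: {initial_score:.3f}, after: {final_score:.3f}")
```

Only the latent code is updated here; the generator and discriminator stay fixed, which matches the order of steps in the abstract.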
-
Publication No.: US20220122232A1
Publication Date: 2022-04-21
Application No.: US17468476
Filing Date: 2021-09-07
Applicant: Adobe Inc.
Inventors: Wei-An Lin, Baldo Faieta, Cameron Smith, Elya Shechtman, Jingwan Lu, Jun-Yan Zhu, Niloy Mitra, Ratheesh Kalarot, Richard Zhang, Shabnam Ghadar, Zhixin Shu
Abstract: Systems and methods generate a filtering function for editing an image with reduced attribute correlation. An image editing system groups training data into bins according to a distribution of a target attribute. For each bin, the system samples a subset of the training data based on a pre-determined target distribution of a set of additional attributes in the training data. The system identifies a direction in the sampled training data corresponding to the distribution of the target attribute to generate a filtering vector for modifying the target attribute in an input image, obtains a latent space representation of an input image, applies the filtering vector to the latent space representation of the input image to generate a filtered latent space representation of the input image, and provides the filtered latent space representation as input to a neural network to generate an output image with a modification to the target attribute.
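A minimal sketch of the bin-then-resample idea in this abstract, under several assumptions that are not taken from the patent: a single binary additional attribute (`extra`), a 50/50 target distribution inside each bin, and a difference of bin means as the way to identify the direction. The sample records and all function names are hypothetical.

```python
import random

def bin_by_target(samples, edges):
    """Group samples into bins by target attribute; edges are ascending bin boundaries."""
    bins = [[] for _ in range(len(edges) + 1)]
    for s in samples:
        index = sum(s["target"] >= e for e in edges)
        bins[index].append(s)
    return bins

def balance_bin(bin_samples, rng):
    """Subsample a bin so both values of the additional attribute occur equally often."""
    with_attr = [s for s in bin_samples if s["extra"] == 1]
    without_attr = [s for s in bin_samples if s["extra"] == 0]
    n = min(len(with_attr), len(without_attr))
    return rng.sample(with_attr, n) + rng.sample(without_attr, n)

def mean_latent(samples):
    """Component-wise mean of the latent codes (assumes a non-empty list)."""
    dim = len(samples[0]["latent"])
    return [sum(s["latent"][i] for s in samples) / len(samples) for i in range(dim)]

def filtering_vector(samples, edges, seed=0):
    """Direction from the lowest to the highest target bin after balancing each bin."""
    rng = random.Random(seed)
    balanced = [balance_bin(b, rng) for b in bin_by_target(samples, edges)]
    low, high = mean_latent(balanced[0]), mean_latent(balanced[-1])
    return [h - l for h, l in zip(high, low)]

def apply_filter(latent, direction, strength):
    """Move a latent code along the filtering vector."""
    return [z + strength * d for z, d in zip(latent, direction)]

def make(age, glasses, count):
    """Toy record: latent = [age / 100, glasses flag]."""
    return [{"target": age, "extra": glasses, "latent": [age / 100, float(glasses)]}
            for _ in range(count)]

# Toy data in which age (target) is correlated with glasses (additional attribute).
data = make(20, 0, 8) + make(20, 1, 2) + make(70, 0, 2) + make(70, 1, 8)

bins = bin_by_target(data, edges=[40])
naive = [h - l for h, l in zip(mean_latent(bins[-1]), mean_latent(bins[0]))]
direction = filtering_vector(data, edges=[40])
edited = apply_filter([0.3, 0.0], direction, strength=1.0)
```

In this toy data the unbalanced direction `naive` carries a large glasses component, while the balanced `direction` changes only the age component, which is the reduced attribute correlation the abstract refers to.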
-
Publication No.: US20240296607A1
Publication Date: 2024-09-05
Application No.: US18178167
Filing Date: 2023-03-03
Applicant: Adobe Inc.
Inventors: Yijun Li, Richard Zhang, Krishna Kumar Singh, Jingwan Lu, Gaurav Parmar, Jun-Yan Zhu
CPC Classification: G06T11/60, G06F40/56, G06T1/0021, G06T5/70, G06V10/44, G06V10/82, G06V20/70, G06T2207/20182
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for utilizing machine learning models to generate modified digital images. In particular, in some embodiments, the disclosed systems generate image editing directions between textual identifiers of two visual features utilizing a language prediction machine learning model and a text encoder. In some embodiments, the disclosed systems generate an inversion of a digital image utilizing a regularized inversion model to guide forward diffusion of the digital image. In some embodiments, the disclosed systems utilize cross-attention guidance to preserve structural details of a source digital image when generating a modified digital image with a diffusion neural network.
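One way to picture an editing direction between two textual identifiers is sketched below. Everything here is an assumption for illustration: the hard-coded sentences stand in for the output of a language prediction model, `toy_text_encoder` is a word-count stand-in for a real text encoder, and taking the difference of mean embeddings is one plausible way to form the direction, not necessarily the one claimed.

```python
# Hypothetical fixed vocabulary for the toy encoder.
VOCAB = ["cat", "dog", "a", "photo", "of", "sitting", "running"]

def toy_text_encoder(sentence):
    """Stand-in text encoder: word counts over VOCAB (a real system would use a learned encoder)."""
    words = sentence.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def mean_embedding(sentences):
    """Average the embeddings of several sentences describing one visual feature."""
    embeddings = [toy_text_encoder(s) for s in sentences]
    return [sum(column) / len(embeddings) for column in zip(*embeddings)]

def edit_direction(source_sentences, target_sentences):
    """Direction from the source feature to the target feature in embedding space."""
    source = mean_embedding(source_sentences)
    target = mean_embedding(target_sentences)
    return [t - s for s, t in zip(source, target)]

# Stand-ins for sentences a language model might produce for "cat" and "dog".
cat_sentences = ["a photo of a cat", "a cat sitting"]
dog_sentences = ["a photo of a dog", "a dog sitting"]

direction = edit_direction(cat_sentences, dog_sentences)
print(dict(zip(VOCAB, direction)))
```

Because the two sentence sets share their context words, averaging cancels the context and leaves a direction that differs only in the "cat" and "dog" components.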
-
Publication No.: US11983628B2
Publication Date: 2024-05-14
Application No.: US17468487
Filing Date: 2021-09-07
Applicant: Adobe Inc.
Inventors: Wei-An Lin, Baldo Faieta, Cameron Smith, Elya Shechtman, Jingwan Lu, Jun-Yan Zhu, Niloy Mitra, Ratheesh Kalarot, Richard Zhang, Shabnam Ghadar, Zhixin Shu
IPC Classification: G06N3/08, G06F3/04845, G06F3/04847, G06F18/21, G06F18/211, G06F18/214, G06F18/40, G06N3/045, G06N20/20, G06T3/02, G06T3/18, G06T3/40, G06T3/4038, G06T3/4046, G06T5/20, G06T5/77, G06T11/00, G06T11/60
CPC Classification: G06N3/08, G06F3/04845, G06F3/04847, G06F18/211, G06F18/214, G06F18/2163, G06F18/40, G06N3/045, G06N20/20, G06T3/02, G06T3/18, G06T3/40, G06T3/4038, G06T3/4046, G06T5/20, G06T5/77, G06T11/001, G06T11/60, G06T2207/10024, G06T2207/20081, G06T2207/20084, G06T2207/20221, G06T2210/22
Abstract: Systems and methods dynamically adjust an available range for editing an attribute in an image. An image editing system computes a metric for an attribute in an input image as a function of a latent space representation of the input image and a filtering vector for editing the input image. The image editing system compares the metric to a threshold. If the metric exceeds the threshold, then the image editing system selects a first range for editing the attribute in the input image. If the metric does not exceed the threshold, a second range is selected. The image editing system causes display of a user interface for editing the input image comprising an interface element for editing the attribute within the selected range.
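The threshold test in this abstract can be sketched in a few lines. The abstract only says the metric is a function of the latent representation and the filtering vector; using their dot product is an assumption, as are the function names and the example range values.

```python
def attribute_metric(latent, filtering_vector):
    """Assumed metric: dot product of the latent code with the filtering vector."""
    return sum(z * d for z, d in zip(latent, filtering_vector))

def select_edit_range(latent, filtering_vector, threshold,
                      first_range=(-1.0, 1.0), second_range=(-3.0, 3.0)):
    """Pick the slider range offered in the UI for this attribute.

    The first range is selected when the metric exceeds the threshold;
    otherwise (including equality) the second range is selected.
    """
    metric = attribute_metric(latent, filtering_vector)
    return first_range if metric > threshold else second_range

age_direction = [1.0, 0.0]  # hypothetical filtering vector
strong = select_edit_range([2.0, 0.0], age_direction, threshold=1.0)
weak = select_edit_range([0.5, 3.0], age_direction, threshold=1.0)
print(strong, weak)
```

The returned tuple would then bound the interface element (for example, a slider) that edits the attribute.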
-
Publication No.: US20220122306A1
Publication Date: 2022-04-21
Application No.: US17468487
Filing Date: 2021-09-07
Applicant: Adobe Inc.
Inventors: Wei-An Lin, Baldo Faieta, Cameron Smith, Elya Shechtman, Jingwan Lu, Jun-Yan Zhu, Niloy Mitra, Ratheesh Kalarot, Richard Zhang, Shabnam Ghadar, Zhixin Shu
IPC Classification: G06T11/60, G06F3/0484, G06N3/08, G06N3/04
Abstract: Systems and methods dynamically adjust an available range for editing an attribute in an image. An image editing system computes a metric for an attribute in an input image as a function of a latent space representation of the input image and a filtering vector for editing the input image. The image editing system compares the metric to a threshold. If the metric exceeds the threshold, then the image editing system selects a first range for editing the attribute in the input image. If the metric does not exceed the threshold, a second range is selected. The image editing system causes display of a user interface for editing the input image comprising an interface element for editing the attribute within the selected range.
-
Publication No.: US20220122221A1
Publication Date: 2022-04-21
Application No.: US17384357
Filing Date: 2021-07-23
Applicant: Adobe Inc.
Inventors: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffrey, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
IPC Classification: G06T3/40, G06F3/0484, G06N3/08, G06N3/04
Abstract: An improved system architecture uses a pipeline including a Generative Adversarial Network (GAN) including a generator neural network and a discriminator neural network to generate an image. An input image in a first domain and information about a target domain are obtained. The domains correspond to image styles. An initial latent space representation of the input image is produced by encoding the input image. An initial output image is generated by processing the initial latent space representation with the generator neural network. Using the discriminator neural network, a score is computed indicating whether the initial output image is in the target domain. A loss is computed based on the computed score. The loss is minimized to compute an updated latent space representation. The updated latent space representation is processed with the generator neural network to generate an output image in the target domain.
-
Publication No.: US20220121931A1
Publication Date: 2022-04-21
Application No.: US17384371
Filing Date: 2021-07-23
Applicant: Adobe Inc.
Inventors: Ratheesh Kalarot, Wei-An Lin, Cameron Smith, Zhixin Shu, Baldo Faieta, Shabnam Ghadar, Jingwan Lu, Aliakbar Darabi, Jun-Yan Zhu, Niloy Mitra, Richard Zhang, Elya Shechtman
Abstract: Systems and methods train and apply a specialized encoder neural network for fast and accurate projection into the latent space of a Generative Adversarial Network (GAN). The specialized encoder neural network includes an input layer, a feature extraction layer, and a bottleneck layer positioned after the feature extraction layer. The projection process includes providing an input image to the encoder and producing, by the encoder, a latent space representation of the input image. Producing the latent space representation includes extracting a feature vector from the feature extraction layer, providing the feature vector to the bottleneck layer as input, and producing the latent space representation as output. The latent space representation produced by the encoder is provided as input to the GAN, which generates an output image based upon the latent space representation. The encoder is trained using specialized loss functions including a segmentation loss and a mean latent loss.
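The encoder layout in this abstract (feature extraction layer followed by a bottleneck layer) can be sketched with plain lists. The layer sizes, weights, and the squared-distance form of the mean latent loss are assumptions for illustration; the segmentation loss is omitted because the abstract gives no detail about it.

```python
def linear(vector, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def relu(vector):
    return [max(0.0, v) for v in vector]

def encode(image, params):
    """Input layer -> feature extraction layer -> bottleneck layer -> latent code."""
    features = relu(linear(image, params["feat_w"], params["feat_b"]))  # feature vector
    return linear(features, params["bott_w"], params["bott_b"])         # latent representation

def mean_latent_loss(latent, mean_latent):
    """Assumed form: mean squared distance between the latent code and a mean latent code."""
    return sum((z - m) ** 2 for z, m in zip(latent, mean_latent)) / len(latent)

# Hypothetical parameters: 4 input pixels -> 3 features -> 2 latent dimensions.
params = {
    "feat_w": [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, -1.0],
               [0.0, 0.0, 1.0, 0.0]],
    "feat_b": [0.0, 0.0, 0.0],
    "bott_w": [[1.0, 1.0, 1.0],
               [0.5, 0.0, -0.5]],
    "bott_b": [0.0, 1.0],
}

image = [1.0, 2.0, 3.0, 4.0]  # flattened 2x2 toy image
latent = encode(image, params)
loss = mean_latent_loss(latent, mean_latent=[3.0, 0.0])
print(latent, loss)
```

The resulting latent code is what would be handed to the GAN generator; during training, a loss like this one would be combined with the other loss terms.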
-
Publication No.: US20240338799A1
Publication Date: 2024-10-10
Application No.: US18178212
Filing Date: 2023-03-03
Applicant: Adobe Inc.
Inventors: Yijun Li, Richard Zhang, Krishna Kumar Singh, Jingwan Lu, Gaurav Parmar, Jun-Yan Zhu
IPC Classification: G06T5/00, G06F40/126, G06T5/50
CPC Classification: G06T5/70, G06F40/126, G06T5/50, G06T2207/20081, G06T2207/20084
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for utilizing machine learning models to generate modified digital images. In particular, in some embodiments, the disclosed systems generate image editing directions between textual identifiers of two visual features utilizing a language prediction machine learning model and a text encoder. In some embodiments, the disclosed systems generate an inversion of a digital image utilizing a regularized inversion model to guide forward diffusion of the digital image. In some embodiments, the disclosed systems utilize cross-attention guidance to preserve structural details of a source digital image when generating a modified digital image with a diffusion neural network.
-
Publication No.: US20240331236A1
Publication Date: 2024-10-03
Application No.: US18178194
Filing Date: 2023-03-03
Applicant: Adobe Inc.
Inventors: Yijun Li, Richard Zhang, Krishna Kumar Singh, Jingwan Lu, Gaurav Parmar, Jun-Yan Zhu
CPC Classification: G06T11/60, G06T5/70, G06T9/00, G06V10/761, G06V10/82, G06V20/70, G06T2207/20182
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for utilizing machine learning models to generate modified digital images. In particular, in some embodiments, the disclosed systems generate image editing directions between textual identifiers of two visual features utilizing a language prediction machine learning model and a text encoder. In some embodiments, the disclosed systems generate an inversion of a digital image utilizing a regularized inversion model to guide forward diffusion of the digital image. In some embodiments, the disclosed systems utilize cross-attention guidance to preserve structural details of a source digital image when generating a modified digital image with a diffusion neural network.
-
Publication No.: US11875221B2
Publication Date: 2024-01-16
Application No.: US17468476
Filing Date: 2021-09-07
Applicant: Adobe Inc.
Inventors: Wei-An Lin, Baldo Faieta, Cameron Smith, Elya Shechtman, Jingwan Lu, Jun-Yan Zhu, Niloy Mitra, Ratheesh Kalarot, Richard Zhang, Shabnam Ghadar, Zhixin Shu
IPC Classification: G06N3/08, G06F3/04845, G06F3/04847, G06T11/60, G06T3/40, G06N20/20, G06T5/00, G06T5/20, G06T3/00, G06T11/00, G06F18/40, G06F18/211, G06F18/214, G06F18/21, G06N3/045
CPC Classification: G06N3/08, G06F3/04845, G06F3/04847, G06F18/211, G06F18/214, G06F18/2163, G06F18/40, G06N3/045, G06N20/20, G06T3/0006, G06T3/0093, G06T3/40, G06T3/4038, G06T3/4046, G06T5/005, G06T5/20, G06T11/001, G06T11/60, G06T2207/10024, G06T2207/20081, G06T2207/20084, G06T2207/20221, G06T2210/22
Abstract: Systems and methods generate a filtering function for editing an image with reduced attribute correlation. An image editing system groups training data into bins according to a distribution of a target attribute. For each bin, the system samples a subset of the training data based on a pre-determined target distribution of a set of additional attributes in the training data. The system identifies a direction in the sampled training data corresponding to the distribution of the target attribute to generate a filtering vector for modifying the target attribute in an input image, obtains a latent space representation of an input image, applies the filtering vector to the latent space representation of the input image to generate a filtered latent space representation of the input image, and provides the filtered latent space representation as input to a neural network to generate an output image with a modification to the target attribute.