FACIAL EXPRESSION AND POSE TRANSFER UTILIZING AN END-TO-END MACHINE LEARNING MODEL

    Publication No.: US20240331322A1

    Publication date: 2024-10-03

    Application No.: US18190673

    Filing date: 2023-03-27

    Applicant: Adobe Inc.

    Inventor: Cameron Smith

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify digital images via scene-based editing using image understanding facilitated by artificial intelligence. For example, in one or more embodiments, the disclosed systems utilize generative machine learning models to create modified digital images portraying human subjects. In particular, the disclosed systems generate modified digital images by performing infill modifications to complete a digital image or human inpainting for portions of a digital image that portrays a human. Moreover, in some embodiments, the disclosed systems perform reposing of subjects portrayed within a digital image to generate modified digital images. In addition, in some embodiments, the disclosed systems perform facial expression transfer and facial expression animations to generate modified digital images or animations.

    SUPERVISED LEARNING TECHNIQUES FOR ENCODER TRAINING

    Publication No.: US20220121932A1

    Publication date: 2022-04-21

    Application No.: US17384378

    Filing date: 2021-07-23

    Applicant: Adobe Inc.

    IPC classification: G06N3/08 G06N3/04

    Abstract: Systems and methods train an encoder neural network for fast and accurate projection into the latent space of a Generative Adversarial Network (GAN). The encoder is trained by providing an input training image to the encoder and producing, by the encoder, a latent space representation of the input training image. The latent space representation is provided as input to the GAN to generate a generated training image. A latent code is sampled from a latent space associated with the GAN and the sampled latent code is provided as input to the GAN. The GAN generates a synthetic training image based on the sampled latent code. The sampled latent code is provided as input to the encoder to produce a synthetic training code. The encoder is updated by minimizing a loss between the generated training image and the input training image, and between the synthetic training code and the sampled latent code.
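
The two-part loss described in this abstract can be sketched on a toy problem. The following is a minimal illustration, not the patented implementation: the "GAN" is a hypothetical frozen linear map `GEN`, the encoder is a small weight matrix, and gradients are taken by finite differences to keep the example dependency-free. It also assumes that the synthetic training code is obtained by encoding the synthetic training image, since the encoder consumes images. All function names are illustrative.

```python
import random

# Hypothetical stand-in for a pretrained, frozen GAN generator:
# a fixed linear map from a 2-D latent code to a 3-pixel "image".
GEN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def generator(latent):
    return matvec(GEN, latent)


def encoder(weights, image):
    # Toy linear encoder: a 2x3 weight matrix applied to the image.
    return matvec(weights, image)


def encoder_loss(weights, real_image, sampled_latent):
    # Image term: input image -> encoder -> GAN -> generated training image.
    generated = generator(encoder(weights, real_image))
    image_term = sq_dist(generated, real_image)
    # Code term: sampled latent -> GAN -> synthetic image -> encoder
    # (assumption: the encoder is applied to the synthetic image).
    synthetic_code = encoder(weights, generator(sampled_latent))
    code_term = sq_dist(synthetic_code, sampled_latent)
    return image_term + code_term


def gradient(weights, real_image, sampled_latent, eps=1e-4):
    # Central finite differences over every encoder weight.
    grads = [[0.0] * len(row) for row in weights]
    for i, row in enumerate(weights):
        for j in range(len(row)):
            original = row[j]
            row[j] = original + eps
            up = encoder_loss(weights, real_image, sampled_latent)
            row[j] = original - eps
            down = encoder_loss(weights, real_image, sampled_latent)
            row[j] = original
            grads[i][j] = (up - down) / (2 * eps)
    return grads


def train_encoder(steps=1000, lr=0.02, seed=0):
    rng = random.Random(seed)
    weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    history = []
    for _ in range(steps):
        # Toy "real" image, drawn from the generator's range for simplicity.
        real_image = generator([rng.uniform(-1, 1), rng.uniform(-1, 1)])
        sampled_latent = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        history.append(encoder_loss(weights, real_image, sampled_latent))
        grads = gradient(weights, real_image, sampled_latent)
        weights = [[w - lr * g for w, g in zip(w_row, g_row)]
                   for w_row, g_row in zip(weights, grads)]
    return weights, history
```

Only the encoder weights are updated; the generator stays fixed throughout, which mirrors the abstract's framing of the encoder as a projection into an existing GAN latent space.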

    Image editing by a generative adversarial network using keypoints or segmentation masks constraints

    Publication No.: US11157773B2

    Publication date: 2021-10-26

    Application No.: US16802243

    Filing date: 2020-02-26

    Applicant: ADOBE INC.

    IPC classification: G06K9/62 G06K9/46 G06K9/00

    Abstract: Images can be edited to include features similar to a different target image. An unconditional generative adversarial network (GAN) is employed to edit features of an initial image based on a constraint determined from a target image. The constraint used by the GAN is determined from keypoints or segmentation masks of the target image, and edits are made to features of the initial image based on keypoints or segmentation masks of the initial image corresponding to those of the constraint from the target image. The GAN modifies the initial image based on a loss function having a variable for the constraint. The result of this optimization process is a modified initial image having features similar to the target image subject to the constraint determined from the identified keypoints or segmentation masks.
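
The optimization described in this abstract can be illustrated with a small sketch. It assumes a hypothetical frozen generator `generate` and a hypothetical keypoint detector `keypoints`, both toy stand-ins rather than anything taken from the patent, and it starts gradient descent from the initial image's latent code so that only the constrained features move. The finite-difference gradient is a simplification chosen to avoid external dependencies.

```python
def generate(latent):
    # Hypothetical frozen generator: 3-D latent -> 4 image features.
    a, b, c = latent
    return [a + 0.5 * c, b - 0.5 * c, c, a - b]


def keypoints(image):
    # Hypothetical keypoint detector: the first two image features are
    # treated as the (x, y) position of one landmark.
    return image[:2]


def constraint_loss(latent, target_keypoints):
    # Squared distance between the edited image's keypoints and the
    # constraint taken from the target image.
    current = keypoints(generate(latent))
    return sum((p - q) ** 2 for p, q in zip(current, target_keypoints))


def edit_latent(initial_latent, target_image, steps=200, lr=0.1, eps=1e-4):
    target_keypoints = keypoints(target_image)
    latent = list(initial_latent)  # start from the initial image's code
    for _ in range(steps):
        grads = []
        for i in range(len(latent)):
            original = latent[i]
            latent[i] = original + eps
            up = constraint_loss(latent, target_keypoints)
            latent[i] = original - eps
            down = constraint_loss(latent, target_keypoints)
            latent[i] = original
            grads.append((up - down) / (2 * eps))
        latent = [z - lr * g for z, g in zip(latent, grads)]
    return latent


if __name__ == "__main__":
    initial = [0.0, 0.0, 0.2]
    target = [0.8, -0.2, 0.0, 0.0]  # only its keypoints are used
    edited = edit_latent(initial, target)
    print(keypoints(generate(edited)))
```

A segmentation-mask constraint would follow the same pattern, with `keypoints` swapped for a mask extractor and the squared distance replaced by a mask-overlap loss.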

    ATTRIBUTE DECORRELATION TECHNIQUES FOR IMAGE EDITING

    Publication No.: US20220122232A1

    Publication date: 2022-04-21

    Application No.: US17468476

    Filing date: 2021-09-07

    Applicant: Adobe Inc.

    IPC classification: G06T5/00 G06T5/20 G06N3/08

    Abstract: Systems and methods generate a filtering function for editing an image with reduced attribute correlation. An image editing system groups training data into bins according to a distribution of a target attribute. For each bin, the system samples a subset of the training data based on a pre-determined target distribution of a set of additional attributes in the training data. The system identifies a direction in the sampled training data corresponding to the distribution of the target attribute to generate a filtering vector for modifying the target attribute in an input image, obtains a latent space representation of an input image, applies the filtering vector to the latent space representation of the input image to generate a filtered latent space representation of the input image, and provides the filtered latent space representation as input to a neural network to generate an output image with a modification to the target attribute.
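
The binning and resampling steps in this abstract can be sketched as follows. This is a toy illustration under stated assumptions: the target attribute is a hypothetical `age` in [0, 1), the single additional attribute is a binary `glasses` flag, the pre-determined target distribution is taken to be uniform over that flag, and the direction is estimated as a simple difference of mean latent codes rather than with whatever classifier the patent uses. The final step of decoding the filtered latent code with a neural network is omitted.

```python
import random
from collections import defaultdict


def toy_dataset():
    # Hypothetical data in which glasses become more common with age,
    # so a naive "age" direction also changes the glasses coordinate.
    records = []
    counts = [(5, 1), (4, 2), (2, 4), (1, 5)]  # (no glasses, glasses) per bin
    for b, (n_plain, n_glasses) in enumerate(counts):
        for glasses, count in ((0, n_plain), (1, n_glasses)):
            for i in range(count):
                age = b / 4 + 0.04 * (i + 1)
                records.append({"age": age, "glasses": glasses,
                                "latent": [age, float(glasses)]})
    return records


def balanced_sample(records, num_bins, rng, values=(0, 1)):
    # Group the data into bins of the target attribute, split by the
    # additional attribute's value.
    bins = defaultdict(lambda: defaultdict(list))
    for rec in records:
        index = min(int(rec["age"] * num_bins), num_bins - 1)
        bins[index][rec["glasses"]].append(rec)
    sampled = []
    for groups in bins.values():
        # Assumed target distribution: equal counts for every value.
        k = min(len(groups[v]) for v in values)
        for v in values:
            sampled.extend(rng.sample(groups[v], k))
    return sampled


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def filtering_vector(records, threshold=0.5):
    # Direction from low to high target attribute: difference of means.
    high = [r["latent"] for r in records if r["age"] >= threshold]
    low = [r["latent"] for r in records if r["age"] < threshold]
    return [h - l for h, l in zip(mean_vector(high), mean_vector(low))]


def apply_filter(latent, direction, strength):
    # Shift a latent code along the filtering vector.
    return [z + strength * d for z, d in zip(latent, direction)]
```

On this toy data, the direction computed from the raw records has a nonzero glasses component, while the direction computed from the balanced sample does not, which is the reduced attribute correlation the abstract refers to.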