Compressing generative adversarial neural networks

    Publication No.: US11934958B2

    Publication Date: 2024-03-19

    Application No.: US17147912

    Application Date: 2021-01-13

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more embodiments of systems, non-transitory computer-readable media, and methods that utilize channel pruning and knowledge distillation to generate a compact noise-to-image GAN. For example, the disclosed systems prune less informative channels via the outgoing channel weights of the GAN. In some implementations, the disclosed systems further apply content-aware pruning, using a differentiable loss between an image generated by the GAN and a modified version of that image to identify sensitive channels during channel pruning. In some embodiments, the disclosed systems utilize knowledge distillation to learn parameters for the pruned GAN so that it mimics the full-size GAN. In certain implementations, the disclosed systems utilize content-aware knowledge distillation by applying content masks to images generated by both the pruned GAN and its full-size counterpart to obtain knowledge distillation losses between the images for use in learning the parameters of the pruned GAN.
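    The patent text publishes no code, so the following PyTorch sketch only illustrates the two ideas named in the abstract: scoring channels by the L2 norm of their outgoing weights, and applying a content-masked distillation loss between a pruned (student) generator and its full-size (teacher) counterpart. All names here (next_conv, content_mask, the layer sizes) are assumptions for illustration, not the disclosed implementation.

# Hypothetical sketch (not from the patent): score generator channels by the
# L2 norm of the weights that consume them in the next layer, and apply a
# content-masked distillation loss between pruned (student) and full-size
# (teacher) generator outputs.
import torch
import torch.nn as nn
import torch.nn.functional as F

def outgoing_channel_scores(next_conv: nn.Conv2d) -> torch.Tensor:
    """Score each channel feeding `next_conv` by the L2 norm of its outgoing
    weights; low-scoring channels are candidates for pruning."""
    # Conv2d weight shape: (out_channels, in_channels, kH, kW)
    return next_conv.weight.detach().pow(2).sum(dim=(0, 2, 3)).sqrt()

def content_aware_distillation_loss(student_img: torch.Tensor,
                                    teacher_img: torch.Tensor,
                                    content_mask: torch.Tensor) -> torch.Tensor:
    """Compare student and teacher images only inside a content mask
    (e.g., a salient-region mask), per the content-aware distillation idea."""
    return F.l1_loss(student_img * content_mask, teacher_img * content_mask)

if __name__ == "__main__":
    next_conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)
    scores = outgoing_channel_scores(next_conv)
    keep = torch.topk(scores, k=32).indices        # retain the 32 most informative channels
    print("kept channels:", sorted(keep.tolist())[:8], "...")

    student_img = torch.rand(1, 3, 256, 256)       # stand-in for pruned-GAN output
    teacher_img = torch.rand(1, 3, 256, 256)       # stand-in for full-GAN output
    content_mask = torch.ones(1, 1, 256, 256)      # placeholder content mask
    print("masked KD loss:",
          content_aware_distillation_loss(student_img, teacher_img, content_mask).item())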

    POINT-BASED NEURAL RADIANCE FIELD FOR THREE DIMENSIONAL SCENE REPRESENTATION

    Publication No.: US20240013477A1

    Publication Date: 2024-01-11

    Application No.: US17861199

    Application Date: 2022-07-09

    Applicant: Adobe Inc.

    IPC Classes: G06T15/20 G06T15/80 G06T15/06

    Abstract: A scene modeling system receives a plurality of input two-dimensional (2D) images corresponding to a plurality of views of an object and a request to display a three-dimensional (3D) scene that includes the object. The scene modeling system generates an output 2D image for a view of the 3D scene by applying a scene representation model to the input 2D images. The scene representation model includes a point cloud generation model configured to generate, based on the input 2D images, a neural point cloud representing the 3D scene. The scene representation model also includes a neural point volume rendering model configured to determine, for each pixel of the output image, a color value using the neural point cloud and a volume rendering process. The scene modeling system transmits, responsive to the request, the output 2D image. Each pixel of the output image includes the respective determined color value.
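    The per-pixel color computation described above follows the standard volume-rendering quadrature used in NeRF-style methods. The PyTorch sketch below shows only that compositing step, assuming densities and colors along each ray have already been aggregated from nearby neural points; the aggregation itself and all names are illustrative assumptions, not the patented pipeline.

# Minimal sketch (illustrative, not the patent's implementation) of the
# per-pixel volume-rendering step: densities and colors sampled along one
# camera ray, here assumed to be aggregated from nearby neural points,
# are composited into a single RGB value.
import torch

def composite_ray(sigmas: torch.Tensor, colors: torch.Tensor, deltas: torch.Tensor) -> torch.Tensor:
    """Standard volume-rendering quadrature.
    sigmas: (N,) densities, colors: (N, 3) RGB samples, deltas: (N,) segment lengths."""
    alphas = 1.0 - torch.exp(-sigmas * deltas)       # per-segment opacity
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alphas + 1e-10]), dim=0)[:-1]
    weights = alphas * trans
    return (weights[:, None] * colors).sum(dim=0)    # (3,) pixel color

if __name__ == "__main__":
    n = 64
    sigmas = torch.rand(n)                 # densities (e.g., decoded from neural-point features)
    colors = torch.rand(n, 3)              # view-dependent colors at the ray samples
    deltas = torch.full((n,), 0.05)        # spacing between consecutive samples
    print("pixel color:", composite_ray(sigmas, colors, deltas))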

    CONTROLLABLE DYNAMIC APPEARANCE FOR NEURAL 3D PORTRAITS

    Publication No.: US20240338915A1

    Publication Date: 2024-10-10

    Application No.: US18132272

    Application Date: 2023-04-07

    Applicant: Adobe Inc.

    Abstract: Certain aspects and features of this disclosure relate to providing a controllable, dynamic appearance for neural 3D portraits. For example, a method involves projecting a color at points in a digital video portrait based on the location, surface normal, and viewing direction of each respective point in a canonical space. The method also involves projecting, using the color, dynamic face normals for the points as they change with the articulated head pose and facial expression in the digital video portrait. The method further involves disentangling, based on the dynamic face normals, the facial appearance in the digital video portrait into intrinsic components in the canonical space. The method additionally involves storing and/or rendering at least a portion of a head pose as a controllable, neural 3D portrait based on the digital video portrait using the intrinsic components.
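    The abstract does not name the intrinsic components, so the sketch below assumes a common albedo-times-shading decomposition: a small PyTorch MLP maps a canonical-space location, its dynamic face normal, and the viewing direction to albedo and shading, then recombines them into a color. The architecture and all names are illustrative guesses, not the patented model; keeping the components separate is what would make the appearance controllable.

# Hypothetical sketch: predict intrinsic components (assumed albedo + shading)
# from canonical location, dynamic face normal, and viewing direction, and
# recombine them into the rendered color. Details are illustrative only.
import torch
import torch.nn as nn

class IntrinsicAppearanceMLP(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        # Input: 3 (location) + 3 (dynamic face normal) + 3 (viewing direction).
        self.trunk = nn.Sequential(nn.Linear(9, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.albedo_head = nn.Linear(hidden, 3)    # view-independent reflectance
        self.shading_head = nn.Linear(hidden, 1)   # pose/expression-dependent shading

    def forward(self, location, normal, view_dir):
        h = self.trunk(torch.cat([location, normal, view_dir], dim=-1))
        albedo = torch.sigmoid(self.albedo_head(h))
        shading = torch.relu(self.shading_head(h))
        return albedo * shading, albedo, shading   # composed color plus the intrinsic parts

if __name__ == "__main__":
    model = IntrinsicAppearanceMLP()
    pts = torch.rand(1024, 3)         # canonical-space sample locations
    normals = torch.rand(1024, 3)     # dynamic face normals (head pose / expression dependent)
    views = torch.rand(1024, 3)       # per-sample viewing directions
    color, albedo, shading = model(pts, normals, views)
    print(color.shape, albedo.shape, shading.shape)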

    SUPERVISED LEARNING TECHNIQUES FOR ENCODER TRAINING

    Publication No.: US20220121932A1

    Publication Date: 2022-04-21

    Application No.: US17384378

    Application Date: 2021-07-23

    Applicant: Adobe Inc.

    IPC Classes: G06N3/08 G06N3/04

    Abstract: Systems and methods train an encoder neural network for fast and accurate projection into the latent space of a Generative Adversarial Network (GAN). The encoder is trained by providing an input training image to the encoder and producing, by the encoder, a latent space representation of the input training image. The latent space representation is provided as input to the GAN to generate a generated training image. A latent code is sampled from a latent space associated with the GAN, and the sampled latent code is provided as input to the GAN. The GAN generates a synthetic training image based on the sampled latent code. The synthetic training image is provided as input to the encoder to produce a synthetic training code. The encoder is updated by minimizing a loss between the generated training image and the input training image, and a loss between the synthetic training code and the sampled latent code.
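    The training objective described above combines an image-space loss on real inputs with a latent-space loss on sampled codes. The PyTorch sketch below reproduces only that loss structure, with toy stand-ins for the encoder and the frozen GAN generator; the real networks, latent dimensionality, and loss weights are assumptions for illustration.

# Sketch of the two-part objective from the abstract, using toy stand-ins for
# the encoder and the (frozen) GAN generator. Networks, dimensions, and loss
# weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, IMG_DIM = 64, 3 * 32 * 32

encoder = nn.Linear(IMG_DIM, LATENT_DIM)              # toy encoder E
generator = nn.Linear(LATENT_DIM, IMG_DIM)            # toy frozen generator G
for p in generator.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
real_images = torch.rand(8, IMG_DIM)                  # input training images x

for step in range(3):
    # (1) Real branch: x -> E(x) -> G(E(x)); compare generated image to input image.
    latent = encoder(real_images)
    generated = generator(latent)
    image_loss = F.mse_loss(generated, real_images)

    # (2) Synthetic branch: sample w, render G(w), feed it to E, compare E(G(w)) to w.
    sampled_code = torch.randn(8, LATENT_DIM)
    synthetic_images = generator(sampled_code)
    synthetic_code = encoder(synthetic_images)
    latent_loss = F.mse_loss(synthetic_code, sampled_code)

    loss = image_loss + latent_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: image_loss={image_loss.item():.4f} latent_loss={latent_loss.item():.4f}")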