AUTODECODING LATENT 3D DIFFUSION MODELS

    Publication Number: US20240420407A1

    Publication Date: 2024-12-19

    Application Number: US18211149

    Filing Date: 2023-06-16

    Applicant: Snap Inc.

    Abstract: Systems and methods for generating static and articulated 3D assets are provided that include a 3D autodecoder at their core. The 3D autodecoder framework embeds properties learned from the target dataset in the latent space, which can then be decoded into a volumetric representation for rendering view-consistent appearance and geometry. The appropriate intermediate volumetric latent space is then identified and robust normalization and de-normalization operations are implemented to learn a 3D diffusion from 2D images or monocular videos of rigid or articulated objects. The methods are flexible enough to use either existing camera supervision or no camera information at all—instead efficiently learning the camera information during training. The generated results are shown to outperform state-of-the-art alternatives on various benchmark datasets and metrics, including multi-view image datasets of synthetic objects, real in-the-wild videos of moving people, and a large-scale, real video dataset of static objects.
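The autodecoder idea in the abstract above — per-object latent codes learned without an encoder, plus normalization of the latent space before diffusion training — can be sketched minimally as follows. This is an illustrative stand-in, not the patented implementation: the decoder is a random linear map, and `latent_codes`, `decode`, `normalize`, and `denormalize` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Autodecoder setup: each training object owns a latent code that is
# optimized directly (there is no encoder network).
num_objects, latent_dim, volume_res = 8, 16, 4
latent_codes = rng.normal(size=(num_objects, latent_dim))         # learned per object
decoder_w = rng.normal(size=(latent_dim, volume_res ** 3)) * 0.1  # stand-in decoder

def decode(code):
    """Map a latent code to a coarse density volume for rendering."""
    return (code @ decoder_w).reshape(volume_res, volume_res, volume_res)

# Normalize the latent space with dataset statistics so a diffusion model
# can be trained on roughly unit-scale inputs, then invert before decoding.
mu, sigma = latent_codes.mean(0), latent_codes.std(0) + 1e-8

def normalize(z):
    return (z - mu) / sigma

def denormalize(z):
    return z * sigma + mu

z = latent_codes[0]
volume = decode(denormalize(normalize(z)))
print(volume.shape)  # (4, 4, 4)
```

A diffusion model would then be trained on `normalize(latent_codes)`, with sampled latents passed through `denormalize` and `decode` at generation time.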

    3D MODELING BASED ON NEURAL LIGHT FIELD
    Invention Publication

    Publication Number: US20240273809A1

    Publication Date: 2024-08-15

    Application Number: US18644653

    Filing Date: 2024-04-24

    Applicant: Snap Inc.

    CPC classification number: G06T15/06 G06T7/97 G06T2207/20081 G06T2207/20084

    Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.
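The key property of the light field network described above is that it maps a ray's origin and direction directly to a pixel value in a single forward pass, rather than integrating many samples along the ray as a radiance field would. A minimal illustrative sketch, with a tiny random MLP standing in for the trained network (`light_field` and its weights are assumptions, not the patented model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in weights for a two-layer MLP over the 6-D ray parameterization.
w1 = rng.normal(size=(6, 32)) * 0.5
w2 = rng.normal(size=(32, 3)) * 0.5

def light_field(origin, direction):
    """One network evaluation per ray: (origin, direction) -> RGB."""
    ray = np.concatenate([origin, direction / np.linalg.norm(direction)])
    h = np.tanh(ray @ w1)
    return 1.0 / (1.0 + np.exp(-(h @ w2)))  # sigmoid keeps color in [0, 1]

pixel = light_field(np.array([0.0, 0.0, -2.0]), np.array([0.0, 0.0, 1.0]))
print(pixel.shape)  # (3,)
```

Rendering a target view then amounts to casting one ray per pixel of the second view and calling `light_field` once per ray.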

    3D MODELING BASED ON NEURAL LIGHT FIELD

    Publication Number: US12002146B2

    Publication Date: 2024-06-04

    Application Number: US17656778

    Filing Date: 2022-03-28

    Applicant: Snap Inc.

    CPC classification number: G06T15/06 G06T7/97 G06T2207/20081 G06T2207/20084

    Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.

    CROSS-MODAL SHAPE AND COLOR MANIPULATION
    Invention Publication

    Publication Number: US20230386158A1

    Publication Date: 2023-11-30

    Application Number: US17814391

    Filing Date: 2022-07-22

    Applicant: Snap Inc.

    CPC classification number: G06T19/20 G06T17/00 G06T2219/2012 G06T2219/2021

    Abstract: Systems, computer readable media, and methods herein describe an editing system in which a three-dimensional (3D) object can be edited by editing a 2D sketch or 2D RGB views of the 3D object. The editing system uses multi-modal variational auto-decoders (MM-VADs) that are trained with a shared latent space, which enables editing 3D objects by editing 2D sketches of the 3D objects. The system determines a latent code that corresponds to an edited or sketched 2D sketch. The latent code is then used as input to the MM-VADs to generate a 3D object. The latent space is divided into a latent space for shapes and a latent space for colors. The MM-VADs are trained with variational auto-encoders (VAEs) and ground truth data.
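The editing loop the abstract describes — find the latent code whose decoded 2D sketch matches the user's edit, then decode that code to 3D, with shape and color occupying separate parts of the latent space — can be illustrated with a toy linear model. Everything here (`w_sketch`, `w_voxels`, `fit_latent_to_sketch`, the dimensions) is a hypothetical stand-in for the trained MM-VADs:

```python
import numpy as np

rng = np.random.default_rng(2)

# Latent space split into shape and color sub-spaces, as in the abstract.
shape_dim, color_dim = 8, 4
w_sketch = rng.normal(size=(shape_dim, 64))              # shape latent -> 2D sketch
w_voxels = rng.normal(size=(shape_dim + color_dim, 27))  # full latent -> 3D voxels

def fit_latent_to_sketch(target_sketch, steps=200, lr=0.05):
    """Gradient-descend a shape latent so its decoded sketch matches the edit."""
    z = np.zeros(shape_dim)
    for _ in range(steps):
        residual = z @ w_sketch - target_sketch
        z -= lr * residual @ w_sketch.T / len(target_sketch)  # gradient of MSE
    return z

edited_sketch = rng.normal(size=64)
z_shape = fit_latent_to_sketch(edited_sketch)
z_color = rng.normal(size=color_dim)  # color can be edited independently
voxels = np.concatenate([z_shape, z_color]) @ w_voxels
print(voxels.shape)  # (27,)
```

Because shape and color live in separate sub-spaces, recoloring an object leaves `z_shape` — and hence its geometry — untouched.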

    3D MODELING BASED ON NEURAL LIGHT FIELD
    Invention Publication

    Publication Number: US20230306675A1

    Publication Date: 2023-09-28

    Application Number: US17656778

    Filing Date: 2022-03-28

    Applicant: Snap Inc.

    CPC classification number: G06T15/06 G06T7/97 G06T2207/20081 G06T2207/20084

    Abstract: Methods and systems are disclosed for performing operations for generating a 3D model of a scene. The operations include: receiving a set of two-dimensional (2D) images representing a first view of a real-world environment; applying a machine learning model comprising a neural light field network to the set of 2D images to predict pixel values of a target image representing a second view of the real-world environment, the machine learning model being trained to map a ray origin and direction directly to a given pixel value; and generating a three-dimensional (3D) model of the real-world environment based on the set of 2D images and the predicted target image.

    FLOW-GUIDED MOTION RETARGETING
    Invention Application

    Publication Number: US20220207786A1

    Publication Date: 2022-06-30

    Application Number: US17557834

    Filing Date: 2021-12-21

    Applicant: Snap Inc.

    Abstract: Systems and methods herein describe a motion retargeting system. The motion retargeting system accesses a plurality of two-dimensional images comprising a person performing a plurality of body poses, extracts a plurality of implicit volumetric representations from the plurality of body poses, generates a three-dimensional warping field, the three-dimensional warping field configured to warp the plurality of implicit volumetric representations from a canonical pose to a target pose, and based on the three-dimensional warping field, generates a two-dimensional image of an artificial person performing the target pose.
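The central operation above — a 3D warping field that moves an implicit volumetric representation from a canonical pose to a target pose — can be sketched with a dense per-voxel offset field and nearest-neighbor resampling. This is an illustrative toy (names, shapes, and the constant one-voxel shift are assumptions), not the patented pipeline:

```python
import numpy as np

rng = np.random.default_rng(3)

res = 8
canonical = rng.normal(size=(res, res, res))  # implicit volumetric features
# Voxel coordinate grid, shape (res, res, res, 3).
grid = np.stack(np.meshgrid(*([np.arange(res)] * 3), indexing="ij"), axis=-1)
# Toy warping field: shift every voxel one step along the z axis.
warp_field = np.zeros(grid.shape, dtype=float)
warp_field[..., 2] = 1.0

def warp(volume, field):
    """Resample the volume at grid + field (nearest neighbor, clamped)."""
    src = np.clip(np.rint(grid + field).astype(int), 0, res - 1)
    return volume[src[..., 0], src[..., 1], src[..., 2]]

target = warp(canonical, warp_field)
print(target.shape)  # (8, 8, 8)
```

A renderer would then produce the 2D image of the retargeted person from `target`; in the real system the field would come from a network conditioned on the target pose rather than a constant shift.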
