3D OBJECT RECONSTRUCTION USING PHOTOMETRIC MESH REPRESENTATION

    Publication number: US20200372710A1

    Publication date: 2020-11-26

    Application number: US16985402

    Filing date: 2020-08-05

    Applicant: Adobe, Inc.

    Abstract: Techniques are disclosed for 3D object reconstruction using photometric mesh representations. A decoder is pretrained to transform points sampled from 2D patches of representative objects into 3D polygonal meshes. An image frame of the object is fed into an encoder to get an initial latent code vector. For each frame and camera pair from the sequence, a polygonal mesh is rendered at the given viewpoints. The mesh is optimized by creating a virtual viewpoint, rasterized to obtain a depth map. The 3D mesh projections are aligned by projecting the coordinates corresponding to the polygonal face vertices of the rasterized mesh to both selected viewpoints. The photometric error is determined from RGB pixel intensities sampled from both frames. Gradients from the photometric error are backpropagated into the vertices of the assigned polygonal indices by relating the barycentric coordinates of each image to update the latent code vector.
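    The photometric error step in the abstract above can be illustrated with a minimal sketch. It assumes the RGB intensities have already been sampled at corresponding projected points in the two frames, and it uses a mean absolute (L1) difference, which is a choice made for this illustration; the abstract does not specify the norm. The function name `photometric_error` and the `RGB` type alias are hypothetical.

```python
from typing import Sequence, Tuple

# Hypothetical type alias: one RGB sample with float channel intensities.
RGB = Tuple[float, float, float]


def photometric_error(samples_a: Sequence[RGB], samples_b: Sequence[RGB]) -> float:
    """Mean absolute per-channel difference between corresponding RGB samples.

    Assumption: samples_a[i] and samples_b[i] were sampled at the projections
    of the same mesh surface point into two different frames.
    """
    if len(samples_a) != len(samples_b):
        raise ValueError("sample lists must have equal length")
    if not samples_a:
        return 0.0
    total = 0.0
    for a, b in zip(samples_a, samples_b):
        # Accumulate the L1 difference over the three colour channels.
        total += sum(abs(ca - cb) for ca, cb in zip(a, b))
    return total / (3 * len(samples_a))


if __name__ == "__main__":
    frame_1 = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    frame_2 = [(1.0, 1.0, 1.0), (1.0, 1.0, 1.0)]
    print(photometric_error(frame_1, frame_2))
```

    In the described technique this scalar would be differentiated with respect to the mesh vertices and, through the decoder, the latent code vector; the sketch covers only the error computation itself.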

    Learning to Personalize Vision-Language Models through Meta-Personalization

    Publication number: US20240419726A1

    Publication date: 2024-12-19

    Application number: US18210535

    Filing date: 2023-06-15

    Applicant: Adobe Inc.

    Abstract: Techniques for learning to personalize vision-language models through meta-personalization are described. In one embodiment, one or more processing devices lock a pre-trained vision-language model (VLM) during a training phase. The processing devices train the pre-trained VLM to augment a text encoder of the pre-trained VLM with a set of general named video instances to form a meta-personalized VLM, the meta-personalized VLM to include global category features. The processing devices test the meta-personalized VLM to adapt the text encoder with a set of personal named video instances to form a personal VLM, the personal VLM comprising the global category features personalized with a set of personal instance weights to form a personal instance token associated with the user. Other embodiments are described and claimed.

    Motion model refinement based on contact analysis and optimization

    Publication number: US11721056B2

    Publication date: 2023-08-08

    Application number: US17573890

    Filing date: 2022-01-12

    Applicant: Adobe Inc.

    CPC classification number: G06T13/40 G06N3/08 G06T15/20 G06T15/50

    Abstract: In some embodiments, a model training system obtains a set of animation models. For each of the animation models, the model training system renders the animation model to generate a sequence of video frames containing a character using a set of rendering parameters and extracts joint points of the character from each frame of the sequence of video frames. The model training system further determines, for each frame of the sequence of video frames, whether a subset of the joint points are in contact with a ground plane in a three-dimensional space and generates contact labels for the subset of the joint points. The model training system trains a contact estimation model using training data containing the joint points extracted from the sequences of video frames and the generated contact labels. The contact estimation model can be used to refine a motion model for a character.
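    The contact-labelling step in the abstract above can be sketched as follows. This is a minimal illustration, assuming the 3D joint positions and the ground plane are already known, and assuming a simple distance tolerance decides contact; the abstract does not state how contact is determined. The joint names, the plane parameterisation, and the tolerance value are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def signed_distance(point: Point3, normal: Point3, offset: float) -> float:
    """Signed distance from a point to the plane dot(normal, x) + offset = 0.

    Assumption: `normal` has unit length.
    """
    return (normal[0] * point[0]
            + normal[1] * point[1]
            + normal[2] * point[2]
            + offset)


def contact_labels(
    frames: Sequence[Dict[str, Point3]],
    normal: Point3 = (0.0, 1.0, 0.0),   # assumed: y-up world, ground at y = 0
    offset: float = 0.0,
    tolerance: float = 0.02,            # assumed threshold in scene units
) -> List[Dict[str, bool]]:
    """Label each joint in each frame as in contact with the ground or not."""
    return [
        {name: abs(signed_distance(p, normal, offset)) <= tolerance
         for name, p in frame.items()}
        for frame in frames
    ]


if __name__ == "__main__":
    frames = [
        {"left_toe": (0.0, 0.01, 0.0), "right_toe": (0.2, 0.30, 0.0)},
        {"left_toe": (0.0, 0.15, 0.1), "right_toe": (0.2, 0.00, 0.1)},
    ]
    for labels in contact_labels(frames):
        print(labels)
```

    Labels produced this way, paired with the 2D joint points extracted from the rendered frames, would form the kind of training data the abstract describes for the contact estimation model.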

    GENERATING ACTION TAGS FOR DIGITAL VIDEOS

    Publication number: US20210409836A1

    Publication date: 2021-12-30

    Application number: US17470441

    Filing date: 2021-09-09

    Applicant: Adobe Inc.

    Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for automatic tagging of videos. In particular, in one or more embodiments, the disclosed systems generate a set of tagged feature vectors (e.g., tagged feature vectors based on action-rich digital videos) to utilize to generate tags for an input digital video. For instance, the disclosed systems can extract a set of frames for the input digital video and generate feature vectors from the set of frames. In some embodiments, the disclosed systems generate aggregated feature vectors from the feature vectors. Furthermore, the disclosed systems can utilize the feature vectors (or aggregated feature vectors) to identify similar tagged feature vectors from the set of tagged feature vectors. Additionally, the disclosed systems can generate a set of tags for the input digital videos by aggregating one or more tags corresponding to identified similar tagged feature vectors.
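    The tagging flow in the abstract above (frame feature vectors, aggregation, similarity search against tagged feature vectors, tag aggregation) can be sketched in a few lines. This is an illustration under stated assumptions: mean pooling for aggregation, cosine similarity for the search, and vote counting over the top-k neighbours are choices made here, not details from the abstract. All function names are hypothetical, and the feature vectors are toy values rather than network outputs.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def aggregate(frame_vectors: Sequence[Vector]) -> List[float]:
    """Mean-pool per-frame feature vectors into one aggregated vector."""
    if not frame_vectors:
        raise ValueError("at least one frame vector is required")
    n = len(frame_vectors)
    return [sum(column) / n for column in zip(*frame_vectors)]


def suggest_tags(
    frame_vectors: Sequence[Vector],
    tagged: Sequence[Tuple[Vector, Sequence[str]]],
    k: int = 2,
    max_tags: int = 3,
) -> List[str]:
    """Collect tags from the k tagged vectors most similar to the query."""
    query = aggregate(frame_vectors)
    ranked = sorted(tagged, key=lambda item: cosine(query, item[0]), reverse=True)
    # Each neighbour votes once for each of its tags.
    votes = Counter(tag for _, tags in ranked[:k] for tag in tags)
    return [tag for tag, _ in votes.most_common(max_tags)]


if __name__ == "__main__":
    tagged = [
        ([1.0, 0.0], ["running", "outdoor"]),
        ([0.9, 0.1], ["running"]),
        ([0.0, 1.0], ["cooking"]),
    ]
    frames = [[1.0, 0.0], [0.8, 0.2]]
    print(suggest_tags(frames, tagged))
```

    A production system would typically replace the linear scan with an approximate nearest-neighbour index, but the voting logic would stay the same.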

    MOTION MODEL REFINEMENT BASED ON CONTACT ANALYSIS AND OPTIMIZATION

    Publication number: US20210335028A1

    Publication date: 2021-10-28

    Application number: US16860411

    Filing date: 2020-04-28

    Applicant: Adobe Inc.

    Abstract: In some embodiments, a motion model refinement system receives an input video depicting a human character and an initial motion model describing motions of individual joint points of the human character in a three-dimensional space. The motion model refinement system identifies foot joint points of the human character that are in contact with a ground plane using a trained contact estimation model. The motion model refinement system determines the ground plane based on the foot joint points and the initial motion model and constructs an optimization problem for refining the initial motion model. The optimization problem minimizes the difference between the refined motion model and the initial motion model under a set of plausibility constraints including constraints on the contact foot joint points and a time-dependent inertia tensor-based constraint. The motion model refinement system obtains the refined motion model by solving the optimization problem.

    3D object reconstruction using photometric mesh representation

    Publication number: US10769848B1

    Publication date: 2020-09-08

    Application number: US16421729

    Filing date: 2019-05-24

    Applicant: Adobe, Inc.

    Abstract: Techniques are disclosed for 3D object reconstruction using photometric mesh representations. A decoder is pretrained to transform points sampled from 2D patches of representative objects into 3D polygonal meshes. An image frame of the object is fed into an encoder to get an initial latent code vector. For each frame and camera pair from the sequence, a polygonal mesh is rendered at the given viewpoints. The mesh is optimized by creating a virtual viewpoint, rasterized to obtain a depth map. The 3D mesh projections are aligned by projecting the coordinates corresponding to the polygonal face vertices of the rasterized mesh to both selected viewpoints. The photometric error is determined from RGB pixel intensities sampled from both frames. Gradients from the photometric error are backpropagated into the vertices of the assigned polygonal indices by relating the barycentric coordinates of each image to update the latent code vector.

    Planar region guided 3D geometry estimation from a single image

    Publication number: US10290112B2

    Publication date: 2019-05-14

    Application number: US15996833

    Filing date: 2018-06-04

    Applicant: ADOBE INC.

    Abstract: Techniques for planar region-guided estimates of 3D geometry of objects depicted in a single 2D image. The techniques estimate regions of an image that are part of planar regions (i.e., flat surfaces) and use those planar region estimates to estimate the 3D geometry of the objects in the image. The planar regions and resulting 3D geometry are estimated using only a single 2D image of the objects. Training data from images of other objects is used to train a CNN with a model that is then used to make planar region estimates using a single 2D image. The planar region estimates, in one example, are based on estimates of planarity (surface plane information) and estimates of edges (depth discontinuities and edges between surface planes) that are estimated using models trained using images of other scenes.
