-
Publication No.: US20250111592A1
Publication Date: 2025-04-03
Application No.: US18892186
Filing Date: 2024-09-20
Applicant: NVIDIA Corporation
Inventor: Dejia Xu, Morteza Mardani, Jiaming Song, Sifei Liu, Ye Yuan, Arash Vahdat
IPC: G06T15/20, G06V10/774, G06V10/776, G06V10/82
Abstract: Virtual reality and augmented reality bring increasing demand for 3D content creation. In an effort to automate the generation of 3D content, artificial intelligence-based processes have been developed. However, these processes are limited in the quality of their output: a model trained only on limited 3D data does not generalize well to unseen objects, while a model trained only on 2D data suffers from poor geometry because it ignores 3D information. The present disclosure jointly uses both 2D and 3D data to train a machine learning model that can generate 3D content from a single 2D image.
-
Publication No.: US20240070987A1
Publication Date: 2024-02-29
Application No.: US18110287
Filing Date: 2023-02-15
Applicant: NVIDIA Corporation
Inventor: Xueting Li, Sifei Liu, Shalini De Mello, Orazio Gallo, Jiashun Wang, Jan Kautz
Abstract: Transferring pose to three-dimensional characters is a common computer graphics task that typically involves transferring the pose of a reference avatar to a (stylized) three-dimensional character. Because three-dimensional characters are created by professional artists through imagination and exaggeration, and thus, unlike human or animal avatars, have distinct shapes and features, matching the pose of a three-dimensional character to that of a reference avatar generally requires manually creating the shape information needed for pose transfer. The present disclosure provides for the automated transfer of a reference pose to a three-dimensional character, based specifically on a learned shape code for the three-dimensional character.
-
Publication No.: US11880927B2
Publication Date: 2024-01-23
Application No.: US18320446
Filing Date: 2023-05-19
Applicant: NVIDIA Corporation
Inventor: Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Jan Kautz
CPC classification number: G06T15/04, G06T7/579, G06T7/70, G06T15/20, G06T17/20, G06T2207/10016, G06T2207/20084, G06T2207/30244
Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object reconstruction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well, particularly for non-rigid objects.
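The invariance constraints mentioned in the abstract can be illustrated with a minimal sketch. The snippet below is a hypothetical helper, not the patented implementation: it shows one way a texture-invariance term could be computed, penalizing per-frame texture predictions for deviating from their across-frame mean, since the object's texture is assumed constant over the video.

```python
import numpy as np

def texture_consistency_loss(textures):
    """Mean squared deviation of per-frame textures from their average.

    textures: (T, H, W, C) array of texture maps predicted for T frames
    of one video. A texture that is constant across frames gives zero loss.
    """
    mean_tex = textures.mean(axis=0, keepdims=True)   # (1, H, W, C)
    return float(((textures - mean_tex) ** 2).mean())
```

Analogous terms for base shape and part correspondence would compare per-frame shape and correspondence predictions in the same way.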
-
Publication No.: US20230015989A1
Publication Date: 2023-01-19
Application No.: US17365877
Filing Date: 2021-07-01
Applicant: NVIDIA Corporation
Inventor: Zhiding Yu, Rui Huang, Wonmin Byeon, Sifei Liu, Guilin Liu, Thomas Breuel, Anima Anandkumar, Jan Kautz
Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
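Step (3), smoothing controlled by both affinity and edge values, can be sketched in NumPy. This is an illustrative sketch only: the function name, the 4-neighborhood average, and the multiplicative gate `affinity * (1 - edge)` are assumptions for illustration, not the disclosed network's exact update rule.

```python
import numpy as np

def edge_gated_propagation(feat, affinity, edge, num_iters=3):
    """One illustrative pass of edge-gated spatial propagation.

    feat:     (H, W, C) semantic feature map
    affinity: (H, W) per-pixel affinity in [0, 1] (propagation strength)
    edge:     (H, W) semantic edge probability in [0, 1] (gates propagation)
    """
    # Propagation is strong where affinity is high and no edge is present.
    gate = affinity * (1.0 - edge)            # (H, W)
    g = gate[..., None]                       # broadcast over channels
    out = feat.copy()
    for _ in range(num_iters):
        # 4-neighborhood average with replicated borders.
        up = np.roll(out, 1, axis=0);    up[0] = out[0]
        down = np.roll(out, -1, axis=0); down[-1] = out[-1]
        left = np.roll(out, 1, axis=1);  left[:, 0] = out[:, 0]
        right = np.roll(out, -1, axis=1); right[:, -1] = out[:, -1]
        neighbor_avg = (up + down + left + right) / 4.0
        # Blend each pixel toward its neighbors according to the gate,
        # so semantic edges block smoothing across object boundaries.
        out = (1.0 - g) * out + g * neighbor_avg
    return out
```

Where the edge map fires, the gate goes to zero and the feature map is left untouched, which is the sense in which edges act as learned gating signals for the message paths.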
-
Publication No.: US20230004760A1
Publication Date: 2023-01-05
Application No.: US17361202
Filing Date: 2021-06-28
Applicant: NVIDIA Corporation
Inventor: Siva Karthik Mustikovela, Shalini De Mello, Aayush Prakash, Umar Iqbal, Sifei Liu, Jan Kautz
IPC: G06K9/62
Abstract: Apparatuses, systems, and techniques to identify objects within an image using self-supervised machine learning. In at least one embodiment, a machine learning system is trained to recognize objects by training a first network to recognize objects within images that are generated by a second network. In at least one embodiment, the second network is a controllable network.
-
Publication No.: US20220270318A1
Publication Date: 2022-08-25
Application No.: US17734244
Filing Date: 2022-05-02
Applicant: NVIDIA Corporation
Inventor: Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Jan Kautz
Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object reconstruction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well, particularly for non-rigid objects.
-
Publication No.: US20220222832A1
Publication Date: 2022-07-14
Application No.: US17570254
Filing Date: 2022-01-06
Applicant: NVIDIA Corporation
Inventor: Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Jan Kautz
IPC: G06T7/246, G06T7/73, G06T7/11, G06V10/764, G06V10/774, G06V10/77, G06V10/82
Abstract: A method and system are provided for tracking instances within a sequence of video frames. The method includes the steps of processing an image frame by a backbone network to generate a set of feature maps, processing the set of feature maps by one or more prediction heads, and analyzing the embedding features corresponding to a set of instances in two or more image frames of the sequence of video frames to establish a one-to-one correlation between instances in different image frames. The one or more prediction heads includes an embedding head configured to generate a set of embedding features corresponding to one or more instances of an object identified in the image frame. The method may also include training the one or more prediction heads using a set of annotated image frames and/or a plurality of sequences of unlabeled video frames.
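As an illustration of establishing a one-to-one correlation between instances across frames, the sketch below matches instance embeddings from two frames by cosine similarity with a greedy assignment. The function name and the greedy strategy are assumptions for illustration; the disclosed system may use a different assignment procedure.

```python
import numpy as np

def match_instances(emb_a, emb_b):
    """Greedy one-to-one matching of instance embeddings between two frames.

    emb_a: (N, D) embeddings for N instances detected in frame A
    emb_b: (M, D) embeddings for M instances detected in frame B
    Returns a list of (i, j) index pairs, highest-similarity pairs first.
    """
    # Cosine similarity between every pair of instances.
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                                      # (N, M)
    pairs, used_a, used_b = [], set(), set()
    # Greedily take the best remaining pair until one side is exhausted,
    # which enforces the one-to-one constraint.
    for idx in np.argsort(-sim, axis=None):
        i, j = np.unravel_index(idx, sim.shape)
        if i not in used_a and j not in used_b:
            pairs.append((int(i), int(j)))
            used_a.add(i)
            used_b.add(j)
    return pairs
```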
-
Publication No.: US11328169B2
Publication Date: 2022-05-10
Application No.: US16353835
Filing Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Sifei Liu, Shalini De Mello, Jinwei Gu, Varun Jampani, Jan Kautz
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color) to another frame represented by coarse data (greyscale). The guidance neural network model generates an affinity matrix, referred to as a global transformation matrix, from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
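The propagation step can be sketched as a linear map: a row-normalized affinity matrix acting on the key-frame's property values. The minimal NumPy sketch below assumes a dense (P x P) affinity over flattened pixels and is illustrative only, not the disclosed TPN architecture; the function name is hypothetical.

```python
import numpy as np

def propagate_properties(affinity, key_props):
    """Propagate per-pixel properties from a key-frame via a global
    transformation (affinity) matrix.

    affinity:  (P, P) raw pairwise similarities between the P pixels of
               the target frame and the P pixels of the key-frame
    key_props: (P, C) property values (e.g. color channels) of the key-frame
    Returns the (P, C) propagated properties for the target frame.
    """
    # Row-normalize so each target pixel's output is a convex combination
    # of key-frame pixel properties.
    G = affinity / affinity.sum(axis=1, keepdims=True)
    return G @ key_props
```

With an identity affinity the key-frame properties pass through unchanged; a learned affinity instead pulls each target pixel's color from the key-frame pixels it is most similar to.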
-
Publication No.: US20210073575A1
Publication Date: 2021-03-11
Application No.: US17081805
Filing Date: 2020-10-27
Applicant: NVIDIA Corporation
Inventor: Sifei Liu, Shalini De Mello, Jinwei Gu, Varun Jampani, Jan Kautz
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color) to another frame represented by coarse data (greyscale). The guidance neural network model generates an affinity matrix, referred to as a global transformation matrix, from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
-
Publication No.: US20190213439A1
Publication Date: 2019-07-11
Application No.: US16353835
Filing Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Sifei Liu, Shalini De Mello, Jinwei Gu, Varun Jampani, Jan Kautz
CPC classification number: G06K9/6215, G06K9/00744, G06K9/6256, G06N3/04, G06N3/08, G06N3/084, G06T5/009, G06T5/50, G06T7/10, G06T7/90, G06T2207/10016, G06T2207/20081, G06T2207/20084, G06T2207/20208
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color) to another frame represented by coarse data (greyscale). The guidance neural network model generates an affinity matrix, referred to as a global transformation matrix, from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.