Publication No.: US20240104842A1
Publication Date: 2024-03-28
Application No.: US18472653
Filing Date: 2023-09-22
Applicant: NVIDIA Corporation
Inventor: Koki Nagano, Alexander Trevithick, Chao Liu, Eric Ryan Chan, Sameh Khamis, Michael Stengel, Zhiding Yu
IPC: G06T17/00, G06T5/20, G06T7/70, G06T7/90, G06V10/771
CPC classification number: G06T17/00, G06T5/20, G06T7/70, G06T7/90, G06V10/771, G06T2207/10024
Abstract: A method is provided for generating, by an encoder-based model, a three-dimensional (3D) representation of a two-dimensional (2D) image. The encoder-based model is trained to infer the 3D representation using a synthetic training data set generated by a pre-trained model. The pre-trained model is a 3D generative model that produces a 3D representation and a corresponding 2D rendering. These pairs form a pseudo ground truth 3D synthetic training data set, which can be used to train a separate encoder-based model for downstream tasks such as estimating a triplane representation, neural radiance field, mesh, depth map, 3D key points, or the like from a single input image. In a particular embodiment, the encoder-based model is trained to predict a triplane representation of the input image, which can then be rendered by a volume renderer according to pose information to generate an output image of the 3D scene from the corresponding viewpoint.
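
The rendering path named in the abstract (a triplane representation rendered by a volume renderer for a given viewpoint) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names (`bilinear_sample`, `sample_triplane`, `decode`, `composite`, `render_ray`) are hypothetical, the three plane features are aggregated by summation, the decoder is a fixed toy mapping rather than a learned network, and the compositing step is the standard alpha-compositing quadrature commonly used with neural radiance fields. The encoder that would predict `planes` from an input image is omitted.

```python
import numpy as np

def bilinear_sample(plane, uv):
    """Bilinearly sample a (C, H, W) plane at (N, 2) coords in [-1, 1]; returns (N, C)."""
    _, H, W = plane.shape
    # Map normalized coordinates to pixel coordinates.
    x = (uv[:, 0] + 1.0) * 0.5 * (W - 1)
    y = (uv[:, 1] + 1.0) * 0.5 * (H - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, H - 2)
    # Clipping the weights clamps out-of-range points to the border.
    wx = np.clip(x - x0, 0.0, 1.0)
    wy = np.clip(y - y0, 0.0, 1.0)
    top = plane[:, y0, x0] * (1 - wx) + plane[:, y0, x0 + 1] * wx
    bot = plane[:, y0 + 1, x0] * (1 - wx) + plane[:, y0 + 1, x0 + 1] * wx
    return (top * (1 - wy) + bot * wy).T

def sample_triplane(planes, pts):
    """planes: (3, C, H, W) for the XY, XZ, YZ planes; pts: (N, 3) in [-1, 1]."""
    xy = bilinear_sample(planes[0], pts[:, [0, 1]])
    xz = bilinear_sample(planes[1], pts[:, [0, 2]])
    yz = bilinear_sample(planes[2], pts[:, [1, 2]])
    return xy + xz + yz  # assumption: features are aggregated by summation

def decode(feats):
    """Toy stand-in for a learned decoder (assumption); needs C >= 4."""
    sigma = np.log1p(np.exp(feats[:, 0]))          # softplus keeps density >= 0
    rgb = 1.0 / (1.0 + np.exp(-feats[:, 1:4]))     # sigmoid keeps color in (0, 1)
    return sigma, rgb

def composite(sigma, rgb, deltas):
    """Alpha-composite samples ordered from near to far along one ray."""
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: fraction of light that reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0), weights

def render_ray(planes, origin, direction, near=0.1, far=2.0, n_samples=32):
    """Render one pixel color for a ray defined by the camera pose."""
    delta = (far - near) / n_samples
    t = near + (np.arange(n_samples) + 0.5) * delta  # midpoints of equal bins
    pts = origin[None, :] + t[:, None] * direction[None, :]
    sigma, rgb = decode(sample_triplane(planes, pts))
    color, _ = composite(sigma, rgb, np.full(n_samples, delta))
    return color
```

In this sketch the pose information enters only through `origin` and `direction`: a full renderer would generate one such ray per output pixel from the camera's extrinsic and intrinsic parameters and call `render_ray` (or a batched equivalent) for each.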