-
Publication No.: US20250111474A1
Publication date: 2025-04-03
Application No.: US18830914
Application date: 2024-09-11
Applicant: NVIDIA Corporation
Inventor: Koki Nagano, Alexander Trevithick, Matthew Aaron Wong Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello
IPC: G06T3/4046, G06T5/60, G06T5/70, G06T15/08
Abstract: Systems and methods are disclosed that relate to synthesizing high-resolution 3D geometry and strictly view-consistent images that maintain image quality without relying on post-processing super-resolution. For instance, embodiments of the present disclosure describe techniques, systems, and/or methods to scale neural volume rendering to the much higher resolution of native 2D images, thereby resolving fine-grained 3D geometry with unprecedented detail. Embodiments of the present disclosure employ learning-based samplers for accelerating neural rendering for 3D GAN training using up to five times fewer depth samples, which enables embodiments of the present disclosure to explicitly “render every pixel” of the full-resolution image during training and inference without post-processing super-resolution in 2D. Together with learning high-quality surface geometry, embodiments of the present disclosure synthesize high-resolution 3D geometry and strictly view-consistent images while maintaining image quality on par with baselines relying on post-processing super-resolution.
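Illustrative note: the abstract does not specify how the learning-based sampler works, so the following is only a minimal NumPy sketch of the general idea of spending fewer depth samples per ray. It pairs the standard volume-rendering quadrature with classical inverse-CDF resampling from a coarse weight distribution, which stands in for the learned sampler. The function names `composite` and `resample_depths` and all array shapes are assumptions made for illustration and are not taken from the application.

```python
import numpy as np

def composite(sigma, color, t):
    """Standard volume-rendering quadrature along a single ray.

    sigma: (N,) densities per interval, color: (N, C) per-interval values,
    t: (N + 1,) interval boundaries along the ray.
    Returns the composited value (C,) and the per-interval weights (N,).
    """
    delta = np.diff(t)                                  # interval lengths
    alpha = 1.0 - np.exp(-sigma * delta)                # opacity per interval
    # Transmittance: probability the ray reaches interval i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    return weights @ color, weights

def resample_depths(t, weights, n_samples, rng):
    """Draw depths from the piecewise-constant PDF defined by `weights`.

    This is classical inverse-transform sampling, used here as a
    stand-in (assumption) for a learned sampler.
    """
    pdf = (weights + 1e-8) / np.sum(weights + 1e-8)
    cdf = np.concatenate([[0.0], np.cumsum(pdf)])
    u = rng.random(n_samples)
    idx = np.clip(np.searchsorted(cdf, u, side="right") - 1, 0, len(pdf) - 1)
    frac = (u - cdf[idx]) / pdf[idx]                    # position inside the bin
    return np.sort(t[idx] + frac * (t[idx + 1] - t[idx]))
```

In this sketch a small number of resampled depths cluster around the intervals that carry most of the rendering weight, which is the reason a good sampler can reduce the per-ray sample count.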
-
Publication No.: US20240135630A1
Publication date: 2024-04-25
Application No.: US18485225
Application date: 2023-10-11
Applicant: NVIDIA Corporation
Inventor: Koki Nagano, Eric Ryan Wong Chan, Tero Tapani Karras, Shalini De Mello, Miika Samuli Aittala, Matthew Aaron Wong Chan
IPC: G06T15/06, G06T5/00, G06T5/50, G06V10/44, G06V10/771
CPC classification number: G06T15/06, G06T5/002, G06T5/50, G06V10/44, G06V10/771, G06T2207/20084, G06T2207/20221
Abstract: A method and system for performing novel image synthesis using generative networks are provided. An encoder-based model is trained to infer a 3D representation of an input image. A feature image is then generated using volume rendering techniques in accordance with the 3D representation. The feature image is then concatenated with a noisy image and processed by a denoiser network to predict an output image from a novel viewpoint that is consistent with the input image. The denoiser network can be a modified Noise Conditional Score Network (NCSN). In some embodiments, multiple input images or keyframes can be provided as input, and a different 3D representation is generated for each input image. The feature image is then generated, during volume rendering, by sampling each of the 3D representations and applying a mean-pooling operation to generate an aggregate feature image.
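Illustrative note: the data flow described in the abstract (sample each per-keyframe 3D representation, mean-pool, volume-render a feature image, concatenate it with a noisy image) can be sketched in NumPy as below. This is a rough sketch under stated assumptions, not the claimed implementation: the names `pooled_feature_image` and `denoiser_input`, the tensor layouts, and the precomputed rendering weights are all hypothetical, and the encoder and denoiser networks are not modeled.

```python
import numpy as np

def pooled_feature_image(sampled, weights, height, width):
    """Mean-pool features across keyframes, then composite along each ray.

    sampled: (K, R, N, C) features sampled from K per-keyframe 3D
             representations at N depth samples on R = height * width rays.
    weights: (R, N) volume-rendering weights per ray (assumed precomputed).
    Returns an aggregate feature image of shape (height, width, C).
    """
    pooled = sampled.mean(axis=0)                       # (R, N, C)
    feats = np.einsum("rn,rnc->rc", weights, pooled)    # weighted sum over depth
    return feats.reshape(height, width, -1)

def denoiser_input(feature_image, noisy_image):
    """Concatenate the feature image and the noisy image channel-wise."""
    return np.concatenate([feature_image, noisy_image], axis=-1)
```

With a single keyframe (K = 1) the mean-pooling step is a no-op, so the same sketch covers the single-image case described first in the abstract.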
-