-
1.
Publication No.: US20240177414A1
Publication Date: 2024-05-30
Application No.: US18071821
Filing Date: 2022-11-30
Applicant: Hsin-Ying Lee , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov , Yinghao Xu
Inventor: Hsin-Ying Lee , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov , Yinghao Xu
CPC classification number: G06T17/00 , G06T7/50 , G06T7/90 , G06V10/82 , G06T2207/10024
Abstract: A three-dimensional (3D) scene is generated from non-aligned generic camera priors by producing a tri-plane representation for an input scene received as a random latent code, obtaining a camera posterior, including posterior parameters representing color and density data, from the random latent code and from generic camera priors without alignment assumptions, and volumetrically rendering an image of the input scene from the color and density data to provide a scene with pixel colors and depth values from an arbitrary camera viewpoint. A depth adaptor processes the depth values to generate an adapted depth map that bridges the domains of rendered and estimated depth maps for the image of the input scene. The adapted depth map, the color data, and scene geometry information from an external dataset are provided to a discriminator, which selects a 3D representation of the input scene.
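As an illustration of the pipeline described above, the following is a minimal PyTorch sketch of tri-plane feature sampling followed by volumetric rendering of per-ray color and depth, the two quantities the depth adaptor and discriminator consume. All names (TriPlane, render_rays) and design details are illustrative assumptions, not the patent's actual implementation.

```python
# Hypothetical sketch: tri-plane representation + volumetric rendering.
# Module/function names and sizes are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriPlane(nn.Module):
    def __init__(self, feat_dim=32, res=64):
        super().__init__()
        # Three learned axis-aligned feature planes: XY, XZ, YZ.
        self.planes = nn.Parameter(torch.randn(3, feat_dim, res, res) * 0.01)
        # Small MLP decoding sampled features to RGB color + density.
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 4),  # (r, g, b, sigma)
        )

    def forward(self, pts):  # pts: (N, 3) in [-1, 1]^3
        xy, xz, yz = pts[:, [0, 1]], pts[:, [0, 2]], pts[:, [1, 2]]
        feats = 0
        for plane, coords in zip(self.planes, (xy, xz, yz)):
            # grid_sample expects (B, C, H, W) input and (B, H_out, W_out, 2) grid.
            sampled = F.grid_sample(
                plane[None], coords[None, None], align_corners=True)
            feats = feats + sampled[0, :, 0].t()  # (N, feat_dim)
        out = self.decoder(feats)
        rgb, sigma = torch.sigmoid(out[:, :3]), F.softplus(out[:, 3])
        return rgb, sigma

def render_rays(model, origins, dirs, n_samples=32, near=0.1, far=2.0):
    """Alpha-composite color and expected depth along each ray."""
    t = torch.linspace(near, far, n_samples)
    pts = origins[:, None] + dirs[:, None] * t[None, :, None]  # (R, S, 3)
    rgb, sigma = model(pts.reshape(-1, 3))
    rgb = rgb.view(*pts.shape[:2], 3)
    sigma = sigma.view(*pts.shape[:2])
    delta = t[1] - t[0]
    alpha = 1 - torch.exp(-sigma * delta)
    # Transmittance T_i = prod_{j<i} (1 - alpha_j).
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1 - alpha + 1e-10], -1),
        dim=-1)[:, :-1]
    w = alpha * trans                    # per-sample compositing weights
    color = (w[..., None] * rgb).sum(1)  # (R, 3) pixel colors
    depth = (w * t).sum(1)               # (R,) expected depth values
    return color, depth
```

In such a pipeline, the rendered depth would feed the depth adaptor, while the rendered colors, the adapted depth map, and external scene-geometry data would go to the discriminator.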
-
2.
Publication No.: US20240193855A1
Publication Date: 2024-06-13
Application No.: US18080089
Filing Date: 2022-12-13
Applicant: Menglei Chai , Hsin-Ying Lee , Aliaksandr Siarohin , Sergey Tulyakov , Yinghao Xu , Ivan Skorokhodov
Inventor: Menglei Chai , Hsin-Ying Lee , Aliaksandr Siarohin , Sergey Tulyakov , Yinghao Xu , Ivan Skorokhodov
Abstract: A 3D-aware generative model for high-quality, controllable scene synthesis uses an abstract object-level representation (i.e., 3D bounding boxes without semantic annotation) as the scene layout prior, which is simple to obtain, general enough to describe various scene contents, and yet informative enough to disentangle objects from the background. An overall layout for the scene is identified, and each object is then placed in the layout to facilitate scene composition. The object-level representation serves as an intuitive user control for scene editing. Based on this prior, the system spatially disentangles the whole scene into object-centric generative radiance fields by learning from only 2D images with global-local discrimination. Once the model is trained, users can generate and edit a scene by explicitly controlling the camera and the layout of the objects' bounding boxes.
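To make the layout prior concrete, here is a hedged Python sketch of how 3D bounding boxes could spatially disentangle a scene into object-centric radiance fields: global points are mapped into each box's canonical frame, each object's field is queried there, and densities blend objects with the background. Function names and the blending rule are illustrative assumptions, not the system's actual implementation.

```python
# Hypothetical sketch of layout-conditioned scene composition.
import torch

def world_to_box(pts, center, size, rotation):
    """Map global points into a box's canonical [-1, 1]^3 frame."""
    # rotation: (3, 3) box-to-world; its inverse is the transpose.
    local = (pts - center) @ rotation  # rotate into the box frame
    return 2.0 * local / size          # normalize by box extent

def compose_fields(pts, boxes, object_fields, background_field):
    """Blend per-object color/density with the background by density."""
    rgbs, sigmas = [], []
    for (center, size, rot), field in zip(boxes, object_fields):
        local = world_to_box(pts, center, size, rot)
        rgb, sigma = field(local)
        # Zero out contributions outside the box so objects stay disentangled.
        inside = (local.abs() <= 1.0).all(-1).float()
        rgbs.append(rgb)
        sigmas.append(sigma * inside)
    rgb_bg, sigma_bg = background_field(pts)
    rgbs.append(rgb_bg)
    sigmas.append(sigma_bg)
    sigmas = torch.stack(sigmas, 0)  # (K+1, N)
    rgbs = torch.stack(rgbs, 0)      # (K+1, N, 3)
    w = sigmas / (sigmas.sum(0, keepdim=True) + 1e-8)
    return (w[..., None] * rgbs).sum(0), sigmas.sum(0)
```

Under such a decomposition, editing the scene reduces to moving a box's center, size, or rotation; the learned fields themselves are untouched, which is what would make the bounding boxes an intuitive user control.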
-
3.
Publication No.: US20240221258A1
Publication Date: 2024-07-04
Application No.: US18089984
Filing Date: 2022-12-28
Applicant: Menglei Chai , Hsin-Ying Lee , Willi Menapace , Kyle Olszewski , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov
Inventor: Menglei Chai , Hsin-Ying Lee , Willi Menapace , Kyle Olszewski , Jian Ren , Aliaksandr Siarohin , Ivan Skorokhodov , Sergey Tulyakov
CPC classification number: G06T13/40 , G06T7/70 , G06T19/20 , G06T2207/20081 , G06T2207/20084 , G06T2207/30201 , G06T2219/2004 , G06T2219/2021
Abstract: Unsupervised volumetric 3D animation (UVA) learns, without annotations, the 3D structure and dynamics of non-rigid deformable objects solely from single-view red/green/blue (RGB) videos and decomposes those videos into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework paired with a keypoint estimator via a differentiable perspective-n-point (PnP) algorithm, the UVA model learns the underlying object 3D geometry and part decomposition in an entirely unsupervised manner from still or video images. This allows the UVA model to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. The UVA model can obtain animatable 3D objects from a single image or a few images. The UVA method also features a space in which all objects are represented in their canonical, animation-ready form. Applications include the creation of lenses from images or videos for social media applications.
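The following Python sketch illustrates one piece of the described pipeline: mapping points from the shared canonical, animation-ready space into a posed frame using per-part rigid transforms (as could be recovered by a differentiable PnP solver) and soft part assignments. All names and the linear-blend formulation are assumptions for illustration, not the patent's actual method.

```python
# Hypothetical sketch: canonical-to-posed mapping with soft part assignment.
import torch

def pose_canonical_points(pts, part_logits, rotations, translations):
    """
    pts:          (N, 3) points in the shared canonical space
    part_logits:  (N, K) soft part-assignment scores per point
    rotations:    (K, 3, 3) per-part rotation matrices
    translations: (K, 3) per-part translations
    """
    weights = torch.softmax(part_logits, dim=-1)  # (N, K)
    # Apply every part's rigid transform to every point: (K, N, 3).
    moved = torch.einsum('kij,nj->kni', rotations, pts) + translations[:, None]
    # Blend the per-part results by the soft assignment (linear blend skinning).
    return torch.einsum('nk,kni->ni', weights, moved)
```

Animating then amounts to changing the per-part rotations and translations (e.g., driven by keypoints tracked in a video) while the canonical geometry stays fixed, which is what makes the recovered objects animation-ready.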