-
Publication No.: US20230095092A1
Publication Date: 2023-03-30
Application No.: US17957143
Filing Date: 2022-09-30
Applicant: NVIDIA Corporation
Inventors: Zhisheng Xiao, Karsten Kreis, Arash Vahdat
IPC Classification: G06T5/00
Abstract: Apparatuses, systems, and techniques are presented to train and utilize one or more neural networks. A denoising diffusion generative adversarial network (denoising diffusion GAN) reduces the number of denoising steps during a reverse process. The denoising diffusion GAN does not assume a Gaussian distribution for large steps of the denoising process and applies a multimodal model to permit denoising with fewer steps. Systems and methods further minimize a divergence between a diffused real data distribution and a diffused generator distribution over several timesteps. Accordingly, various embodiments may enable faster sample generation, in which the samples are generated from noise using the denoising diffusion GAN.
-
Publication No.: US20220383570A1
Publication Date: 2022-12-01
Application No.: US17827394
Filing Date: 2022-05-27
Applicant: NVIDIA Corporation
Inventors: Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba Barriuso, Sanja Fidler
IPC Classification: G06T11/60, G06T7/10, G06V10/776, G06V10/774
Abstract: In various examples, high-precision semantic image editing for machine learning systems and applications is described. For example, a generative adversarial network (GAN) may be used to jointly model images and their semantic segmentations based on the same underlying latent code. Image editing may be achieved by using segmentation mask modifications (e.g., provided by a user, or otherwise) to optimize the latent code to be consistent with the updated segmentation, thus effectively changing the original image (e.g., an RGB image). To improve the efficiency of the system, and to avoid requiring optimization for each edit on each image, editing vectors that realize the edits may be learned in latent space and applied directly to other images, with or without additional optimization. As a result, a GAN in combination with the optimization approaches described herein may simultaneously allow for high-precision editing in real time with straightforward compositionality of multiple edits.
-
Publication No.: US20240161403A1
Publication Date: 2024-05-16
Application No.: US18232279
Filing Date: 2023-08-09
Applicant: NVIDIA Corporation
Inventors: Chen-Hsuan Lin, Tsung-Yi Lin, Ming-Yu Liu, Sanja Fidler, Karsten Kreis, Luming Tang, Xiaohui Zeng, Jun Gao, Xun Huang, Towaki Takikawa
CPC Classification: G06T17/20, G06T3/40, G06T15/04, G06T17/005, G06T19/20
Abstract: Text-to-image generation generally refers to the process of generating an image from one or more text prompts input by a user. While artificial intelligence has been a valuable tool for text-to-image generation, current artificial intelligence-based solutions are more limited when it comes to text-to-3D content creation. For example, these solutions are often category-dependent or synthesize 3D content at a low resolution. The present disclosure provides a process and architecture for high-resolution text-to-3D content creation.
-