-
Publication No.: US20240193887A1
Publication Date: 2024-06-13
Application No.: US18361587
Filing Date: 2023-07-28
Applicant: NVIDIA Corporation
Inventor: Zekun Hao , Ming-Yu Liu , Arun Mohanray Mallya
IPC: G06T19/20
CPC classification number: G06T19/20 , G06F30/10 , G06T2210/56 , G06T2219/2021
Abstract: Synthesis of high-quality 3D shapes with smooth surfaces has various creative and practical use cases, such as 3D content creation and CAD modeling. A vector field decoder neural network is trained to predict a generative vector field (GVF) representation of a 3D shape from a latent representation (latent code or feature volume) of the 3D shape. The GVF representation is agnostic to surface orientation, all dimensions of the vector field vary smoothly, the GVF can represent both watertight and non-watertight 3D shapes, and there is a one-to-one mapping between a predicted 3D shape and the ground truth 3D shape (i.e., the mapping is bijective). The vector field decoder can synthesize 3D shapes in multiple categories and can also synthesize 3D shapes for objects that were not included in the training dataset. In other words, the vector field decoder is also capable of zero-shot generation.
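A toy illustration of the vector-field idea described above: each query point stores a displacement vector to its nearest point on the surface, so the field is smooth everywhere and carries no inside/outside orientation. The surface here is an analytic unit sphere, not the output of the patent's learned decoder.

```python
import numpy as np

def field_to_surface(query, radius=1.0):
    """Return the displacement from each query point to the sphere surface."""
    norm = np.linalg.norm(query, axis=-1, keepdims=True)
    nearest = query / norm * radius  # nearest surface point on the sphere
    return nearest - query

q = np.array([[2.0, 0.0, 0.0], [0.0, 0.5, 0.0]])
disp = field_to_surface(q)
# Adding the field to a query point lands exactly on the surface.
print(np.linalg.norm(q + disp, axis=-1))  # [1. 1.]
```

Note that the same field describes the shape whether a query point is inside or outside the surface, which is the sense in which such a representation is agnostic to surface orientation.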
-
Publication No.: US11983815B2
Publication Date: 2024-05-14
Application No.: US17718172
Filing Date: 2022-04-11
Applicant: NVIDIA Corporation
Inventor: Tianchang Shen , Jun Gao , Kangxue Yin , Ming-Yu Liu , Sanja Fidler
CPC classification number: G06T17/20 , G06T7/50 , G06T2207/10028 , G06T2207/20081 , G06T2207/20084
Abstract: In various examples, a deep three-dimensional (3D) conditional generative model is implemented that can synthesize high resolution 3D shapes using simple guides—such as coarse voxels, point clouds, etc.—by marrying implicit and explicit 3D representations into a hybrid 3D representation. The present approach may directly optimize for the reconstructed surface, allowing for the synthesis of finer geometric details with fewer artifacts. The systems and methods described herein may use a deformable tetrahedral grid that encodes a discretized signed distance function (SDF) and a differentiable marching tetrahedral layer that converts the implicit SDF representation to an explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology as well as generation of the hierarchy of subdivisions using reconstruction and adversarial losses defined explicitly on the surface mesh.
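A minimal sketch of the marching-tetrahedra step named above: one tetrahedron with a signed distance value at each of its four vertices, where zero crossings on sign-change edges are found by linear interpolation. The vertex positions and SDF values are illustrative, not taken from the patent.

```python
import numpy as np

# One tetrahedron: four vertices with a signed distance value at each.
verts = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
sdf = np.array([-0.5, 0.5, 0.5, 0.5])  # one vertex inside, three outside

# The six edges of a tetrahedron as vertex-index pairs.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def crossing_points(verts, sdf):
    """Linearly interpolate the zero crossing on each sign-change edge."""
    points = []
    for i, j in edges:
        if sdf[i] * sdf[j] < 0:  # surface passes through this edge
            t = sdf[i] / (sdf[i] - sdf[j])
            points.append(verts[i] + t * (verts[j] - verts[i]))
    return np.array(points)

pts = crossing_points(verts, sdf)
# With one negative vertex, exactly three edges cross zero: the surface
# patch inside this tetrahedron is a single triangle.
print(len(pts))  # 3
```

Because the crossing position is a differentiable function of the SDF values (and, in the patent's deformable grid, of the vertex positions), losses defined on the extracted mesh can be backpropagated to the implicit representation.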
-
Publication No.: US20240144568A1
Publication Date: 2024-05-02
Application No.: US17903585
Filing Date: 2022-09-06
Applicant: Nvidia Corporation
Inventor: Siddharth Gururani , Arun Mallya , Ting-Chun Wang , Jose Rafael Valle da Costa , Ming-Yu Liu
CPC classification number: G06T13/205 , G06V10/82 , G06V40/171
Abstract: Apparatuses, systems, and techniques are presented to generate digital content. In at least one embodiment, one or more neural networks are used to generate video information based at least in part upon voice information and a combination of image features and facial landmarks corresponding to one or more images of a person.
-
Publication No.: US20240114162A1
Publication Date: 2024-04-04
Application No.: US17955734
Filing Date: 2022-09-29
Applicant: Nvidia Corporation
Inventor: Aurobinda Maharana , Arun Mallya , Ming-Yu Liu , Abhijit Patait
Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.
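A hedged sketch of the selection rule described above: an intra-frame joins the reference set only when some attribute of the depicted object differs sufficiently from every frame already in the set. The scalar attribute (e.g., a head-pose angle) and the threshold are invented for illustration and are not figures from the patent.

```python
def select_references(frames, threshold=10.0):
    """Keep a frame only if its attribute differs from all selected frames.

    frames: list of (frame_id, attribute) pairs, in stream order.
    """
    selected = []
    for frame_id, attribute in frames:
        if all(abs(attribute - a) >= threshold for _, a in selected):
            selected.append((frame_id, attribute))
    return selected

frames = [(0, 0.0), (1, 2.0), (2, 15.0), (3, 16.0), (4, -12.0)]
print(select_references(frames))  # [(0, 0.0), (2, 15.0), (4, -12.0)]
```

The effect is a small, diverse set of intra-frames: near-duplicates (frames 1 and 3 above) are skipped, so later inter-frames can reference whichever retained intra-frame best matches the object's current appearance.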
-
Publication No.: US20240095989A1
Publication Date: 2024-03-21
Application No.: US17945951
Filing Date: 2022-09-15
Applicant: NVIDIA Corporation
Inventor: Arun Mohanray Mallya , Ting-Chun Wang , Ming-Yu Liu
CPC classification number: G06T13/20 , G06T7/20 , G06V10/25 , G06V10/443 , G06V10/761 , G06V10/771 , G06V10/82 , G06T2207/20081 , G06T2207/30252
Abstract: Apparatuses, systems, and techniques to generate a video using two or more images comprising objects to be included in the video. In at least one embodiment, objects are identified in two or more images using one or more neural networks, and a video is generated that includes the objects.
-
Publication No.: US11610435B2
Publication Date: 2023-03-21
Application No.: US17069478
Filing Date: 2020-10-13
Applicant: NVIDIA Corporation
Inventor: Tero Tapani Karras , Samuli Matias Laine , David Patrick Luebke , Jaakko T. Lehtinen , Miika Samuli Aittala , Timo Oskari Aila , Ming-Yu Liu , Arun Mohanray Mallya , Ting-Chun Wang
Abstract: A latent code defined in an input space is processed by the mapping neural network to produce an intermediate latent code defined in an intermediate latent space. The intermediate latent code may be used as appearance vector that is processed by the synthesis neural network to generate an image. The appearance vector is a compressed encoding of data, such as video frames including a person's face, audio, and other data. Captured images may be converted into appearance vectors at a local device and transmitted to a remote device using much less bandwidth compared with transmitting the captured images. A synthesis neural network at the remote device reconstructs the images for display.
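A back-of-envelope illustration of the bandwidth claim above, comparing one raw 1080p frame against one appearance vector. The latent dimensionality (512 float32 values) follows common practice for mapping networks of this kind and is an assumption, not a figure from the patent.

```python
# One raw 1080p RGB frame versus one compact appearance vector.
frame_bytes = 1920 * 1080 * 3   # raw 8-bit RGB pixels
vector_bytes = 512 * 4          # hypothetical 512-dim float32 latent
ratio = frame_bytes / vector_bytes
print(f"raw frame: {frame_bytes} B, appearance vector: {vector_bytes} B, "
      f"~{ratio:.0f}x smaller")
```

Even before conventional video compression enters the comparison, transmitting the vector and reconstructing frames with a synthesis network on the receiving device moves three orders of magnitude less data per frame under these assumptions.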
-
Publication No.: US20200334502A1
Publication Date: 2020-10-22
Application No.: US16921012
Filing Date: 2020-07-06
Applicant: NVIDIA Corporation
Inventor: Wei-Chih Tu , Ming-Yu Liu , Varun Jampani , Deqing Sun , Ming-Hsuan Yang , Jan Kautz
Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
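The horizontal/vertical affinity maps described above can be pictured with a hand-crafted stand-in for the learned PAN output: affinity between neighboring pixels computed from color similarity. The exponential-of-distance form and the toy image are illustrative assumptions, not the trained network.

```python
import numpy as np

def affinity_maps(image, sigma=0.1):
    """Stand-in for PAN output: horizontal and vertical neighbor affinities.

    Affinity is near 1 for similar colors, near 0 across strong color edges.
    """
    h = np.exp(-np.linalg.norm(image[:, 1:] - image[:, :-1], axis=-1) / sigma)
    v = np.exp(-np.linalg.norm(image[1:, :] - image[:-1, :], axis=-1) / sigma)
    return h, v

# Toy 2x4 RGB "image": left half red, right half blue.
img = np.zeros((2, 4, 3))
img[:, :2] = [1.0, 0.0, 0.0]
img[:, 2:] = [0.0, 0.0, 1.0]

h, v = affinity_maps(img)
# h has one value per horizontal neighbor pair, v per vertical pair;
# affinity is ~1 inside each color region and ~0 across the red/blue edge.
```

Grouping pixels connected by high affinities then yields the superpixels; the patent's contribution is learning these maps rather than hand-crafting them as done here.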
-
Publication No.: US10748036B2
Publication Date: 2020-08-18
Application No.: US16188641
Filing Date: 2018-11-13
Applicant: NVIDIA Corporation
Inventor: Wei-Chih Tu , Ming-Yu Liu , Varun Jampani , Deqing Sun , Ming-Hsuan Yang , Jan Kautz
Abstract: Segmentation is the identification of separate objects within an image. An example is identification of a pedestrian passing in front of a car, where the pedestrian is a first object and the car is a second object. Superpixel segmentation is the identification of regions of pixels within an object that have similar properties. An example is identification of pixel regions having a similar color, such as different articles of clothing worn by the pedestrian and different components of the car. A pixel affinity neural network (PAN) model is trained to generate pixel affinity maps for superpixel segmentation. The pixel affinity map defines the similarity of two points in space. In an embodiment, the pixel affinity map indicates a horizontal affinity and vertical affinity for each pixel in the image. The pixel affinity map is processed to identify the superpixels.
-
Publication No.: US20200242736A1
Publication Date: 2020-07-30
Application No.: US16261395
Filing Date: 2019-01-29
Applicant: Nvidia Corporation
Inventor: Ming-Yu Liu , Xun Huang , Tero Karras , Timo Aila , Jaakko Lehtinen
Abstract: A few-shot, unsupervised image-to-image translation (“FUNIT”) algorithm is disclosed that accepts as input images of previously-unseen target classes. These target classes are specified at inference time by only a few images, such as a single image or a pair of images, of an object of the target type. A FUNIT network can be trained using a data set containing images of many different object classes, in order to translate images from one class to another class by leveraging few input images of the target class. By learning to extract appearance patterns from the few input images for the translation task, the network learns a generalizable appearance pattern extractor that can be applied to images of unseen classes at translation time for a few-shot image-to-image translation task.
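One piece of the few-shot mechanism above can be sketched as a class-appearance code obtained by averaging per-image features of the handful of target-class images supplied at inference time. The feature extractor below is a fixed random projection standing in for the learned appearance-pattern extractor; its shape and the image sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 64))   # stand-in "appearance encoder" weights

def class_code(images):
    """Average per-image feature vectors into one code for the target class."""
    feats = [W @ img.ravel() for img in images]
    return np.mean(feats, axis=0)

few_shot = [rng.random((8, 8)) for _ in range(2)]  # two target-class examples
code = class_code(few_shot)
print(code.shape)  # (8,)
```

Averaging makes the code depend on the class's shared appearance rather than any single example, which is why one or two images of an unseen class can steer the translation.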
-
Publication No.: US20190355103A1
Publication Date: 2019-11-21
Application No.: US16353195
Filing Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Seung-Hwan Baek , Kihwan Kim , Jinwei Gu , Orazio Gallo , Alejandro Jose Troccoli , Ming-Yu Liu , Jan Kautz
Abstract: Missing image content is generated using a neural network. In an embodiment, a high resolution image and associated high resolution semantic label map are generated from a low resolution image and associated low resolution semantic label map. The input image/map pair (low resolution image and associated low resolution semantic label map) lacks detail and is therefore missing content. Rather than simply enhancing the input image/map pair, data missing in the input image/map pair is improvised or hallucinated by a neural network, creating plausible content while maintaining spatio-temporal consistency. Missing content is hallucinated to generate a detailed zoomed in portion of an image. Missing content is hallucinated to generate different variations of an image, such as different seasons or weather conditions for a driving video.
-