-
Publication No.: US20200074707A1
Publication date: 2020-03-05
Application No.: US16201934
Filing date: 2018-11-27
Applicant: NVIDIA CORPORATION
Inventor: Donghoon LEE , Sifei LIU , Jinwei GU , Ming-Yu LIU , Jan KAUTZ
Abstract: One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.
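As an illustration of how an affine transformation can represent a bounding box, the sketch below maps the corners of a unit square through a 2x3 affine matrix and takes the axis-aligned extent of the result. The unit-square convention, the function name `affine_to_bbox`, and the matrix layout are assumptions made for this example; the abstract does not specify them.

```python
def affine_to_bbox(theta):
    """Map the unit square through a 2x3 affine matrix and return its extent.

    theta is [[a, b, tx], [c, d, ty]] (an assumed layout, not from the patent).
    Returns (x_min, y_min, x_max, y_max) of the transformed square.
    """
    (a, b, tx), (c, d, ty) = theta
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    # Apply x' = a*x + b*y + tx and y' = c*x + d*y + ty to each corner.
    points = [(a * x + b * y + tx, c * x + d * y + ty) for x, y in corners]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

With a pure scale-and-translate matrix such as `[[2, 0, 3], [0, 4, 5]]`, the result is a box of width 2 and height 4 whose top-left corner sits at (3, 5).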
-
Publication No.: US20180075611A1
Publication date: 2018-03-15
Application No.: US15823370
Filing date: 2017-11-27
Applicant: NVIDIA Corporation
Inventor: Gregory P. MEYER , Shalini GUPTA , Iuri FROSIO , Nagilla Dikpal REDDY , Jan KAUTZ
CPC classification number: G06T7/507 , G06K9/6276 , G06T7/251 , G06T7/277 , G06T7/70 , G06T7/75 , G06T7/77 , G06T2200/28 , G06T2207/10016 , G06T2207/10028 , G06T2207/30201
Abstract: One embodiment of the present invention sets forth a technique for estimating a head pose of a user. The technique includes acquiring depth data associated with a head of the user and initializing each particle included in a set of particles with a different candidate head pose. The technique further includes performing one or more optimization passes that include performing at least one iterative closest point (ICP) iteration for each particle and performing at least one particle swarm optimization (PSO) iteration. Each ICP iteration includes rendering the three-dimensional reference model based on the candidate head pose associated with the particle and comparing the three-dimensional reference model to the depth data. Each PSO iteration comprises updating a global best head pose associated with the set of particles and modifying at least one candidate head pose. The technique further includes modifying a shape of the three-dimensional reference model based on depth data.
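The particle swarm portion of this technique can be sketched with a textbook PSO update. The code below is a minimal illustration, not the patented implementation: the ICP iterations and the depth-data comparison are omitted, the 6-value pose vector, the coefficients `w`, `c1`, `c2`, and the caller-supplied `cost` function are all assumptions made for the example.

```python
import random

def pso_iteration(swarm, gbest, gbest_cost, cost, rng, w=0.7, c1=1.5, c2=1.5):
    """Run one PSO iteration in place and return the updated global best."""
    for p in swarm:
        x, v = p["pose"], p["vel"]
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            # Standard velocity update: inertia + pull toward personal and global bests.
            v[d] = w * v[d] + c1 * r1 * (p["best"][d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
            x[d] += v[d]  # modify the candidate pose
        c = cost(x)
        if c < p["best_cost"]:
            p["best"], p["best_cost"] = list(x), c
        if c < gbest_cost:
            gbest, gbest_cost = list(x), c  # update the global best pose
    return gbest, gbest_cost

def optimize(cost, n_particles=8, dims=6, iterations=30, seed=0):
    """Initialize each particle with a different candidate pose, then iterate."""
    rng = random.Random(seed)
    swarm = []
    for _ in range(n_particles):
        pose = [rng.uniform(-1.0, 1.0) for _ in range(dims)]
        swarm.append({"pose": pose, "vel": [0.0] * dims,
                      "best": list(pose), "best_cost": cost(pose)})
    best = min(swarm, key=lambda p: p["best_cost"])
    gbest, gbest_cost = list(best["best"]), best["best_cost"]
    history = [gbest_cost]
    for _ in range(iterations):
        gbest, gbest_cost = pso_iteration(swarm, gbest, gbest_cost, cost, rng)
        history.append(gbest_cost)
    return gbest, history
```

In a fuller pipeline, `cost` would presumably measure the mismatch between the rendered reference model and the depth data, and each particle would be refined by ICP before the swarm update; here any scalar function of the pose can be passed in to exercise the loop.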
-
Publication No.: US20170046827A1
Publication date: 2017-02-16
Application No.: US14825129
Filing date: 2015-08-12
Applicant: NVIDIA CORPORATION
Inventor: Gregory P. MEYER , Shalini GUPTA , Iuri FROSIO , Nagilla Dikpal REDDY , Jan KAUTZ
IPC: G06T7/00
CPC classification number: G06T7/507 , G06K9/6276 , G06T7/251 , G06T7/277 , G06T7/70 , G06T7/75 , G06T7/77 , G06T2200/28 , G06T2207/10016 , G06T2207/10028 , G06T2207/30201
Abstract: One embodiment of the present invention sets forth a technique for estimating a head pose of a user. The technique includes acquiring depth data associated with a head of the user and initializing each particle included in a set of particles with a different candidate head pose. The technique further includes performing one or more optimization passes that include performing at least one iterative closest point (ICP) iteration for each particle and performing at least one particle swarm optimization (PSO) iteration. Each ICP iteration includes rendering the three-dimensional reference model based on the candidate head pose associated with the particle and comparing the three-dimensional reference model to the depth data. Each PSO iteration comprises updating a global best head pose associated with the set of particles and modifying at least one candidate head pose. The technique further includes modifying a shape of the three-dimensional reference model based on depth data.
-
Publication No.: US20250094819A1
Publication date: 2025-03-20
Application No.: US18471184
Filing date: 2023-09-20
Applicant: NVIDIA CORPORATION
Inventor: Wonmin BYEON , Sudarshan BABU , Shalini DE MELLO , Jan KAUTZ
IPC: G06N3/096 , G06N3/0455
Abstract: One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes executing a first attention unit included in the transformer neural network to convert a first input token into a first query, a first key, and a first plurality of values, where each value included in the first plurality of values represents a sub-task associated with the transformer neural network. The technique also includes computing a first plurality of outputs associated with the first input token based on the first query, the first key, and the first plurality of values. The technique further includes performing a task associated with an input corresponding to the first input token based on the first input token and the first plurality of outputs.
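One possible reading of an attention unit that yields a single query, a single key, and several values per token is sketched below: the attention weights are computed once from queries and keys, then reused to aggregate each sub-task's values separately. This is an interpretation for illustration only; the function name `multi_value_attention`, the scaled dot-product scoring, and the data layout are assumptions, not details taken from the patent.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_value_attention(queries, keys, values):
    """Compute one output per (token, sub-task) pair.

    queries[i] and keys[j] are vectors; values[j][s] is the value vector of
    token j for sub-task s (assumed layout). Returns outputs[i][s].
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    n_tasks = len(values[0])
    outputs = []
    for q in queries:
        # Attention weights are shared across all sub-tasks.
        weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        per_task = []
        for s in range(n_tasks):
            dim = len(values[0][s])
            per_task.append([
                sum(weights[j] * values[j][s][d] for j in range(len(keys)))
                for d in range(dim)
            ])
        outputs.append(per_task)
    return outputs
```

Sharing the weights means the cost of scoring tokens is paid once, while each sub-task still receives its own aggregated output.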
-
Publication No.: US20240161404A1
Publication date: 2024-05-16
Application No.: US18497938
Filing date: 2023-10-30
Applicant: NVIDIA CORPORATION
Inventor: Yang FU , Sifei LIU , Jan KAUTZ , Xueting LI , Shalini DE MELLO , Amey KULKARNI , Milind NAPHADE
IPC: G06T17/20
CPC classification number: G06T17/20
Abstract: In various embodiments, a training application trains a machine learning model to generate three-dimensional (3D) representations of two-dimensional images. The training application maps a depth image and a viewpoint to signed distance function (SDF) values associated with 3D query points. The training application maps a red, blue, and green (RGB) image to radiance values associated with the 3D query points. The training application computes a red, blue, green, and depth (RGBD) reconstruction loss based on at least the SDF values and the radiance values. The training application modifies at least one of a pre-trained geometry encoder, a pre-trained geometry decoder, an untrained texture encoder, or an untrained texture decoder based on the RGBD reconstruction loss to generate a trained machine learning model that generates 3D representations of RGBD images.
-
Publication No.: US20230319218A1
Publication date: 2023-10-05
Application No.: US18173603
Filing date: 2023-02-23
Applicant: NVIDIA Corporation
Inventor: Yuzhuo REN , Nuri Murat ARAR , Orazio GALLO , Jan KAUTZ , Niranjan AVADHANAM , Hang SU
CPC classification number: H04N5/2624 , G06V20/56
Abstract: In various examples, a state machine is used to select between a default seam placement or dynamic seam placement that avoids salient regions, and to enable and disable dynamic seam placement based on speed of ego-motion, direction of ego-motion, proximity to salient objects, active viewport, driver gaze, and/or other factors. Images representing overlapping views of an environment may be aligned to create an aligned composite image or surface (e.g., a panorama, a 360° image, bowl shaped surface) with overlapping regions of image data, and a default or dynamic seam placement may be selected based on driving scenario (e.g., driving direction, speed, proximity to nearby objects). As such, seams may be positioned in the overlapping regions of image data, and the image data may be blended at the seams to create a stitched image or surface (e.g., a stitched panorama, stitched 360° image, stitched textured surface).
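The selection between default and dynamic seam placement can be pictured as a two-state machine. The sketch below is a simplified illustration that uses only ego-speed and distance to the nearest salient object; the thresholds, the hysteresis gap between enabling and disabling, and the names `SeamMode` and `next_mode` are assumptions made for this example, and the other factors listed in the abstract (direction, viewport, driver gaze) are left out.

```python
from enum import Enum

class SeamMode(Enum):
    DEFAULT = "default"
    DYNAMIC = "dynamic"

def next_mode(mode, speed_mps, nearest_salient_m,
              max_speed_mps=5.0, enable_dist_m=3.0, disable_dist_m=4.0):
    """Return the seam placement mode for the next frame.

    All thresholds are hypothetical. Dynamic placement is disabled at high
    speed, enabled when a salient object comes close, and disabled again only
    once the object is clearly farther away (hysteresis avoids flicker).
    """
    if speed_mps > max_speed_mps:
        return SeamMode.DEFAULT
    if mode is SeamMode.DEFAULT and nearest_salient_m < enable_dist_m:
        return SeamMode.DYNAMIC
    if mode is SeamMode.DYNAMIC and nearest_salient_m > disable_dist_m:
        return SeamMode.DEFAULT
    return mode
```

Calling `next_mode` once per frame with the previous mode keeps seams stable while an object hovers near the enabling distance.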
-
Publication No.: US20230316635A1
Publication date: 2023-10-05
Application No.: US18173623
Filing date: 2023-02-23
Applicant: NVIDIA Corporation
Inventor: Hairong JIANG , Nuri Murat ARAR , Orazio GALLO , Jan KAUTZ , Ronan LETOQUIN
CPC classification number: G06T15/20 , G06T17/20 , G06T19/20 , G06T7/70 , G06T2219/2004
Abstract: In various examples, an environment surrounding an ego-object is visualized using an adaptive 3D bowl that models the environment with a shape that changes based on distance (and direction) to one or more representative point(s) on detected objects. Distance (and direction) to detected objects may be determined using 3D object detection or a top-down 2D or 3D occupancy grid, and used to adapt the shape of the adaptive 3D bowl in various ways (e.g., by sizing its ground plane to fit within the distance to the closest detected object, fitting a shape using an optimization algorithm). The adaptive 3D bowl may be enabled or disabled during each time slice (e.g., based on ego-speed), and the 3D bowl for each time slice may be used to render a visualization of the environment (e.g., a top-down projection image, a textured 3D bowl, and/or a rendered view thereof).
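One of the adaptations mentioned above, sizing the ground plane to fit within the distance to the closest detected object, can be sketched in a few lines. The function name `ground_plane_radius`, the default and minimum radii, and the speed threshold used to disable adaptation are hypothetical values chosen for the example.

```python
def ground_plane_radius(object_distances_m, ego_speed_mps,
                        default_radius_m=10.0, min_radius_m=1.0,
                        max_adaptive_speed_mps=5.0):
    """Pick a ground-plane radius for the bowl in the current time slice.

    Falls back to the default radius when adaptation is disabled (ego-speed
    above an assumed threshold) or when no objects were detected. Otherwise
    the radius shrinks to the closest object, clamped to [min, default].
    """
    if ego_speed_mps > max_adaptive_speed_mps or not object_distances_m:
        return default_radius_m
    closest = min(object_distances_m)
    return max(min_radius_m, min(default_radius_m, closest))
```

For example, with objects at 4.0 m and 7.5 m and a low ego-speed, the ground plane would end at 4.0 m, so the nearest object is projected onto the bowl wall rather than flattened onto the ground.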
-
Publication No.: US20230267659A1
Publication date: 2023-08-24
Application No.: US17933811
Filing date: 2022-09-20
Applicant: NVIDIA CORPORATION
Inventor: Benjamin ECKART , Jan KAUTZ , Chao LIU , Benjamin WU
CPC classification number: G06T11/006 , G06F17/141 , G01B9/02041
Abstract: In various embodiments, an inference application reconstructs representations of items in a spectral domain. The inference application maps a first set of data points associated with both an item and the spectral domain to conditioning information via a first trained machine learning model. The inference application updates a second trained machine learning model based on the conditioning information to generate a model that represents the item within the spectral domain. The inference application generates a second set of data points associated with both the item and the spectral domain via the model. The inference application constructs an image associated with the item based on the second set of data points.
-
Publication No.: US20220398697A1
Publication date: 2022-12-15
Application No.: US17681625
Filing date: 2022-02-25
Applicant: NVIDIA CORPORATION
Inventor: Arash VAHDAT , Karsten KREIS , Jan KAUTZ
Abstract: One embodiment of the present invention sets forth a technique for generating data. The technique includes sampling from a first distribution associated with a score-based generative model to generate a first set of values. The technique also includes performing one or more denoising operations via the score-based generative model to convert the first set of values into a first set of latent variable values associated with a latent space. The technique further includes converting the first set of latent variable values into a generative output.
-
Publication No.: US20220335672A1
Publication date: 2022-10-20
Application No.: US17585449
Filing date: 2022-01-26
Applicant: NVIDIA Corporation
Inventor: Donghoon LEE , Sifei LIU , Jinwei GU , Ming-Yu LIU , Jan KAUTZ
IPC: G06T11/60 , G06T3/00 , G06K9/62 , G06T7/30 , G06V30/262
Abstract: One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.