-
Publication No.: US20240411709A1
Publication Date: 2024-12-12
Application No.: US18810657
Filing Date: 2024-08-21
Applicant: NVIDIA Corporation
Inventor: William James Dally , Carl Thomas Gray , Stephen W. Keckler , James Michael O'Connor
IPC: G06F13/16 , G11C8/12 , H03K19/1776
Abstract: Embodiments of the present disclosure relate to application partitioning for locality in a stacked memory system. In an embodiment, one or more memory dies are stacked on the processor die. The processor die includes multiple processing tiles and each memory die includes multiple memory tiles. Vertically aligned memory tiles are directly coupled to and comprise the local memory block for a corresponding processing tile. An application program that operates on dense multi-dimensional arrays (matrices) may partition the dense arrays into sub-arrays associated with program tiles. Each program tile is executed by a processing tile using the processing tile's local memory block to process the associated sub-array. Data associated with each sub-array is stored in a local memory block and the processing tile corresponding to the local memory block executes the program tile to process the sub-array data.
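The partitioning idea in this abstract can be illustrated with a minimal sketch. It assumes a 2D array split evenly over a rectangular grid of processing tiles; the function names and the even-split rule are illustrative assumptions, not taken from the patent.

```python
def partition_into_tiles(rows, cols, tile_rows, tile_cols):
    """Map each processing tile (tr, tc) to the row/column ranges of its sub-array.

    Hypothetical helper: assumes the array divides evenly across the tile grid.
    """
    if rows % tile_rows or cols % tile_cols:
        raise ValueError("array dimensions must divide evenly across the tile grid")
    sub_r, sub_c = rows // tile_rows, cols // tile_cols
    return {
        (tr, tc): (
            range(tr * sub_r, (tr + 1) * sub_r),
            range(tc * sub_c, (tc + 1) * sub_c),
        )
        for tr in range(tile_rows)
        for tc in range(tile_cols)
    }


def owner_tile(r, c, rows, cols, tile_rows, tile_cols):
    """Return the tile whose local memory block would hold element (r, c)."""
    return (r // (rows // tile_rows), c // (cols // tile_cols))
```

Under these assumptions, every element has exactly one owning tile, so a program tile only touches data held in its own local memory block.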
-
Publication No.: US12167169B1
Publication Date: 2024-12-10
Application No.: US17933186
Filing Date: 2022-09-19
Applicant: NVIDIA Corporation
Inventor: Siddha Ganju , Ruthie Lyle , Naveen Kumar Rai , Ronay Ak , Andrew Russell
Abstract: A digital avatar system can process video streams and generate synthetic video with a digital avatar. The digital avatar provides the appearance of a participant from the video stream talking and one or more of performing various behaviors or actions consistent with the participant's behavior when they are live streamed. A digital avatar system can detect triggering events during a live stream and automatically switch to an avatar mode.
-
Publication No.: US12164059B2
Publication Date: 2024-12-10
Application No.: US17377064
Filing Date: 2021-07-15
Applicant: NVIDIA Corporation
Inventor: Nikolai Smolyanskiy , Ryan Oldja , Ke Chen , Alexander Popov , Joachim Pehserl , Ibrahim Eden , Tilman Wekel , David Wehr , Ruchi Bhargava , David Nister
IPC: G01S7/48 , B60W60/00 , G01S17/89 , G01S17/931 , G05D1/00 , G06N3/045 , G06T19/00 , G06V10/10 , G06V10/25 , G06V10/26 , G06V10/44 , G06V10/764 , G06V10/774 , G06V10/80 , G06V10/82 , G06V20/56 , G06V20/58
Abstract: A deep neural network(s) (DNN) may be used to detect objects from sensor data of a three dimensional (3D) environment. For example, a multi-view perception DNN may include multiple constituent DNNs or stages chained together that sequentially process different views of the 3D environment. An example DNN may include a first stage that performs class segmentation in a first view (e.g., perspective view) and a second stage that performs class segmentation and/or regresses instance geometry in a second view (e.g., top-down). The DNN outputs may be processed to generate 2D and/or 3D bounding boxes and class labels for detected objects in the 3D environment. As such, the techniques described herein may be used to detect and classify animate objects and/or parts of an environment, and these detections and classifications may be provided to an autonomous vehicle drive stack to enable safe planning and control of the autonomous vehicle.
-
Publication No.: US20240406154A1
Publication Date: 2024-12-05
Application No.: US18528603
Filing Date: 2023-12-04
Applicant: NVIDIA Corporation
Inventor: Miriam Menes , Naveen Cherukuri , Ahmad Atamli , Uria Basher , Mike Osborn , Mark Hummel , Liron Mula
IPC: H04L9/40
Abstract: Technologies for encrypting communication links between devices are described. A method includes generating a first initialization vector (IV), from a first subspace of IVs, for a first cryptographic ordered flow, and a second IV, from a second subspace of IVs that is mutually exclusive with the first subspace, for a second cryptographic ordered flow. The first and second cryptographic ordered flows share a key to secure multipath routing in a fabric between devices. The method sends, to the second device, a first packet for the first cryptographic ordered flow and a second packet for the second cryptographic ordered flow. The first packet includes a first security tag with the first IV and a first payload encrypted using the first IV and a first key. The second packet includes a second security tag with the second IV and a second payload encrypted using the second IV and a second key.
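One way to obtain mutually exclusive IV subspaces is to reserve some IV bits for a flow identifier and use the rest as a per-flow counter. The sketch below assumes a 96-bit IV with an 8-bit flow field; this layout and the class name are illustrative assumptions, not the format described in the patent, and no encryption is performed.

```python
IV_BITS = 96                         # assumed IV width
FLOW_BITS = 8                        # assumed width of the flow-identifier field
COUNTER_BITS = IV_BITS - FLOW_BITS   # remaining bits form a per-flow counter


class FlowIvGenerator:
    """Hypothetical generator that yields IVs from one flow's private subspace."""

    def __init__(self, flow_id):
        if not 0 <= flow_id < (1 << FLOW_BITS):
            raise ValueError("flow_id does not fit in the flow field")
        self.flow_id = flow_id
        self.counter = 0

    def next_iv(self):
        if self.counter >= (1 << COUNTER_BITS):
            raise OverflowError("IV subspace exhausted for this flow")
        # High bits identify the flow, so two flows can never produce the same IV.
        iv = (self.flow_id << COUNTER_BITS) | self.counter
        self.counter += 1
        return iv.to_bytes(IV_BITS // 8, "big")
```

Because the flow identifier occupies fixed bits, IVs from different flows cannot collide even when the flows share a key and their packets take different paths.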
-
Publication No.: US20240401975A1
Publication Date: 2024-12-05
Application No.: US18326730
Filing Date: 2023-05-31
Applicant: NVIDIA Corporation
Inventor: Alexander Korovko , Aigul Dzhumamuralova
Abstract: In various examples, sensor fusion for visual-inertial odometry in autonomous and semi-autonomous systems and applications is described herein. Systems and methods are disclosed that split processing into at least two components. For example, the first component may be configured to process incoming frames, execute one or more perspective-n-point techniques to determine states of a machine, update states associated with one or more inertial measurement unit sensors of the machine, and add new frames to a map. The second component may be configured to adjust states (e.g., poses) associated with the machine using one or more sparse bundle adjustment techniques, adjust points within an environment, and adjust IMU-related parameters using a history of camera states. In some examples, the PnP technique and/or the SBA technique may be selected based on states associated with the IMU sensor(s).
-
Publication No.: US20240400097A1
Publication Date: 2024-12-05
Application No.: US18674551
Filing Date: 2024-05-24
Applicant: NVIDIA Corporation
Inventor: David Nister
Abstract: Costs associated with configurations corresponding to a maneuver type(s) may be stored in a transition state(s) volume. The same memory volume may be used for storing cost values that correspond to different maneuver types and different vertices in a graph of a configuration space. In at least one embodiment, to share a memory volume between maneuver types, the system may determine a cost for a machine to reach a configuration of a configuration space using various different maneuver types. The system may then evaluate one or more of the costs to determine which of the costs to store at one or more memory location(s) corresponding to the configuration (e.g., a point in a memory volume). Cost values for the memory volume may be efficiently determined using kernel-style processing.
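The shared-volume idea can be sketched as follows. The abstract does not say how the stored cost is chosen; this sketch assumes the lowest cost across maneuver types is kept, and it uses a plain loop rather than kernel-style processing. All names are hypothetical.

```python
def fill_cost_volume(configs, maneuver_costs):
    """Store one (cost, maneuver) entry per configuration in a shared volume.

    configs: iterable of hashable configurations (e.g. grid coordinates).
    maneuver_costs: dict mapping a maneuver name to a cost function.
    Assumption: the cheapest maneuver wins the shared memory location.
    """
    volume = {}
    for cfg in configs:
        best = None
        for maneuver, cost_fn in maneuver_costs.items():
            cost = cost_fn(cfg)
            if best is None or cost < best[0]:
                best = (cost, maneuver)
        volume[cfg] = best
    return volume
```

Keeping a single entry per configuration is what lets several maneuver types share one memory volume instead of each needing its own.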
-
Publication No.: US12159344B2
Publication Date: 2024-12-03
Application No.: US18339166
Filing Date: 2023-06-21
Applicant: NVIDIA CORPORATION
Inventor: Robert A. Alfieri , Peter S. Shirley
Abstract: One embodiment of a computer-implemented method for processing ray tracing operations in parallel includes receiving a plurality of rays and a corresponding set of importance sampling instructions for each ray included in the plurality of rays for processing, wherein each ray represents a path from a light source to at least one point within a three-dimensional (3D) environment, and each corresponding set of importance sampling instructions is based at least in part on one or more material properties associated with at least one surface of at least one object included in the 3D environment; assigning each ray included in the plurality of rays to a different processing core included in a plurality of processing cores; and for each ray included in the plurality of rays, causing the processing core assigned to the ray to execute the corresponding set of importance sampling instructions on the ray to generate a direction for a secondary ray that is produced when the ray intersects a surface of an object within the 3D environment.
-
Publication No.: US20240395027A1
Publication Date: 2024-11-28
Application No.: US18322940
Filing Date: 2023-05-24
Applicant: NVIDIA Corporation
Inventor: Rui Shen , Sebastian Michael Agethen , Jian xing Zhang
Abstract: In various examples, multilabel hierarchical classification of objects for autonomous systems and applications is described herein. Systems and methods are disclosed that use one or more neural networks to classify objects, such as traffic signs, using multilabel classification and/or hierarchical classification. For instance, a multilabel subnetwork of the neural network(s) may classify an object based at least on one or more attributes associated with the object. As such, the output from the multilabel subnetwork may include at least a classification associated with the object and an attribute classification(s) associated with the object. A hierarchical subnetwork of the neural network(s) may also classify the object using one or more class labels, where a class label indicates another classification and/or a class group associated with the object. The systems and methods may then use the classification, the attribute classification(s), and/or the class label(s) to determine a final classification associated with the object.
-
Publication No.: US12154214B2
Publication Date: 2024-11-26
Application No.: US17941578
Filing Date: 2022-09-09
Applicant: NVIDIA Corporation
Inventor: Gregory Muthler , John Burgess , Magnus Andersson , Timo Viitanen , Levi Oliver
Abstract: An alternate root tree or graph structure for ray and path tracing enables dynamic instancing build time decisions to split any number of geometry acceleration structures in a manner that is developer transparent, nearly memory storage neutral, and traversal efficient. The resulting traversals only need to partially traverse the acceleration structure, which improves efficiency. One example use reduces the number of false positive instance acceleration structure to geometry acceleration structure transitions for many spatially separated instances of the same geometry.
-
Publication No.: US12154188B2
Publication Date: 2024-11-26
Application No.: US17890849
Filing Date: 2022-08-18
Applicant: NVIDIA Corporation
Inventor: Fnu Ratnesh Kumar , Farzin Aghdasi , Parthasarathy Sriram , Edwin Weill
IPC: G06T1/20 , G06F17/18 , G06N3/045 , G06N3/047 , G06N3/08 , G06V10/764 , G06V10/82 , G06V20/52 , G06V20/58
Abstract: In various examples, a neural network may be trained for use in vehicle re-identification tasks (e.g., matching appearances and classifications of vehicles across frames) in a camera network. The neural network may be trained to learn an embedding space such that embeddings corresponding to vehicles of the same identity are projected closer to one another within the embedding space, as compared to vehicles representing different identities. To accurately and efficiently learn the embedding space, the neural network may be trained using a contrastive loss function or a triplet loss function. In addition, to further improve accuracy and efficiency, a sampling technique, referred to herein as batch sample, may be used to identify embeddings, during training, that are most meaningful for updating parameters of the neural network.
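For reference, the standard triplet loss named in this abstract can be written in a few lines. The sketch uses Euclidean distance and an arbitrary margin as assumptions; it does not implement the patent's batch sample technique, which the abstract does not define.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor and positive are embeddings of the same vehicle identity;
    negative is an embedding of a different identity. The margin value
    is an illustrative assumption.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

The loss is zero once the same-identity pair is closer than the different-identity pair by at least the margin, which is the "projected closer to one another" property the abstract describes.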