-
Publication No.: US20250045622A1
Publication Date: 2025-02-06
Application No.: US18230311
Application Date: 2023-08-04
Applicant: NVIDIA Corporation
Inventor: Shekhar Dwivedi, Rahul Choudhury
IPC: G06N20/00
Abstract: Apparatuses, systems, and techniques for efficient profiling, scheduling, and batch execution of multiple machine learning models (MLMs). Efficient batch execution includes obtaining execution metrics characterizing expected utilization of computational resources by the MLMs, generating at least one batch queue having one or more MLM batches of MLMs with a combined expected utilization not exceeding a threshold utilization, and initiating parallel execution of the MLMs using the generated MLM batches.
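Illustrative note (not part of the patent record): the batching idea in this abstract can be sketched as a bin-packing step. The sketch below assumes a first-fit-decreasing heuristic, integer utilization percentages, and made-up model names; the abstract does not say which grouping strategy or metrics the application actually uses, and `build_batch_queue` is a hypothetical helper.

```python
from typing import Dict, List


def build_batch_queue(expected_util: Dict[str, int], threshold: int) -> List[List[str]]:
    """Group models into batches whose combined expected utilization
    (in percent) does not exceed `threshold`.

    Assumption: first-fit-decreasing packing, chosen only for illustration.
    """
    batches: List[List[str]] = []  # model names per batch
    loads: List[int] = []          # combined utilization per batch

    # Place the most demanding models first.
    ordered = sorted(expected_util.items(), key=lambda kv: kv[1], reverse=True)
    for name, util in ordered:
        if util > threshold:
            raise ValueError(f"{name} alone exceeds the threshold utilization")
        for i, load in enumerate(loads):
            if load + util <= threshold:
                # Fits into an existing batch.
                batches[i].append(name)
                loads[i] += util
                break
        else:
            # No existing batch has room, so open a new one.
            batches.append([name])
            loads.append(util)
    return batches


# Hypothetical profiling results: expected utilization in percent.
metrics = {"segmenter": 50, "classifier": 30, "detector": 40, "denoiser": 20}
queue = build_batch_queue(metrics, threshold=80)
```

Each inner list in `queue` is one batch whose models could then be launched in parallel, with batches processed one after another.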
-
Publication No.: US12217331B2
Publication Date: 2025-02-04
Application No.: US17955380
Application Date: 2022-09-28
Applicant: NVIDIA Corporation
Inventor: Shekhar Dwivedi
Abstract: Disclosed are apparatuses, systems, and techniques that enable compressed grid-based graph representations for efficient implementations of graph-mapped computing applications. The techniques include but are not limited to selecting a reference grid having a plurality of blocks, assigning nodes of the graph to blocks of the grid, and generating a graph representation that maps directions, relative to the reference grid, of nodal connections of the graph.
-
Publication No.: US11037338B2
Publication Date: 2021-06-15
Application No.: US16376449
Application Date: 2019-04-05
Applicant: Nvidia Corporation
Inventor: Shekhar Dwivedi
Abstract: This disclosure introduces an approach that includes techniques for determining an optimal weighted execution sequence of available reconstruction algorithms using a multi-processor unit. The introduced approach includes executing a series of optimal weighted execution sequence candidates on a representative slice of the image data and comparing their results to select one of the candidates as the optimal weighted execution sequence.
-
Publication No.: US20230097169A1
Publication Date: 2023-03-30
Application No.: US17491341
Application Date: 2021-09-30
Applicant: NVIDIA Corporation
Inventor: Shekhar Dwivedi, Nicholas Alexander Haemel
Abstract: Apparatuses, systems, and techniques are disclosed to generate a derived artificial intelligence (AI) model from a plurality of AI models. In at least one embodiment, at least one common feature shared among the plurality of AI models is identified, and the derived AI model is generated based on the at least one common feature shared among the plurality of AI models.
-
Publication No.: US20220261287A1
Publication Date: 2022-08-18
Application No.: US17174951
Application Date: 2021-02-12
Applicant: NVIDIA Corporation
Inventor: Shekhar Dwivedi, Andreas Heumann
Abstract: Systems and methods for improving the degree to which programs utilize processor resources during execution. A number of different versions of a program are received, as is a set of performance metrics describing desired performance of the program versions. The programs are then analyzed to determine the amount of processor resources used on a particular processor when the programs are executed to meet the performance metrics. At runtime, a program version that meets its performance metrics without exceeding the available processor resources is selected for execution by the processor. Program versions may be versions written to utilize processors in differing manners, such as by adjusting the numerical precision at which operations are performed or stored. If no program version meets its performance metrics without exceeding the available processor resources, the performance metrics may be reduced and program selection may be based on these reduced performance metrics.
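Illustrative note (not part of the patent record): the runtime selection described in this abstract can be sketched as a loop that checks each version against the available resources and relaxes the target when nothing fits. The sketch assumes a deliberately simple cost model (resources needed grow linearly with the target), a halving rule for relaxing the target, and a caller-supplied preference order. None of these details, nor the names `ProgramVersion` and `select_version`, come from the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProgramVersion:
    name: str            # e.g. the numerical precision the version uses
    cost_per_unit: int   # hypothetical: resource percent needed per unit of target


def select_version(
    versions: List[ProgramVersion],
    target: int,
    available: int,
    max_relaxations: int = 3,
) -> Tuple[Optional[ProgramVersion], int]:
    """Return the first version (in caller preference order) that can meet
    `target` within `available` resources, halving the target when no
    version fits. Returns (None, last_target) if nothing ever fits."""
    for _ in range(max_relaxations + 1):
        for version in versions:
            required = version.cost_per_unit * target  # assumed linear cost model
            if required <= available:
                return version, target
        # No version fits: reduce the performance target and retry.
        target //= 2
    return None, target


# Hypothetical profiles: the fp16 version needs half the resources of fp32.
versions = [ProgramVersion("fp32", 2), ProgramVersion("fp16", 1)]
chosen, met_target = select_version(versions, target=100, available=60)
```

In this example neither version can meet the original target within 60 percent of the processor, so the target is halved once and the lower-precision version is selected.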
-
Publication No.: US10699447B2
Publication Date: 2020-06-30
Application No.: US16570865
Application Date: 2019-09-13
Applicant: Nvidia Corporation
Inventor: Shekhar Dwivedi
Abstract: A plurality of processors with logic units to train one or more neural networks for image construction, at least in part, using established one or more levels of compression for image data from a region of interest (ROI).
-
Publication No.: US12045666B2
Publication Date: 2024-07-23
Application No.: US17249194
Application Date: 2021-02-23
Applicant: NVIDIA Corporation
Inventor: Shekhar Dwivedi, Rahul Choudhury
IPC: G06F9/50, G06F11/30, G06F11/34, G06F16/901
CPC classification number: G06F9/5083, G06F11/3006, G06F11/3409, G06F16/9024
Abstract: Apparatuses, systems, and techniques to collect performance data for one or more computational tasks executed by a plurality of nodes of a computational pipeline and enable optimization of distribution of task execution among the plurality of nodes.
-
Publication No.: US20240104790A1
Publication Date: 2024-03-28
Application No.: US17955380
Application Date: 2022-09-28
Applicant: NVIDIA Corporation
Inventor: Shekhar Dwivedi
Abstract: Disclosed are apparatuses, systems, and techniques that enable compressed grid-based graph representations for efficient implementations of graph-mapped computing applications. The techniques include but are not limited to selecting a reference grid having a plurality of blocks, assigning nodes of the graph to blocks of the grid, and generating a graph representation that maps directions, relative to the reference grid, of nodal connections of the graph.
-
Publication No.: US20200090383A1
Publication Date: 2020-03-19
Application No.: US16570865
Application Date: 2019-09-13
Applicant: Nvidia Corporation
Inventor: Shekhar Dwivedi
Abstract: A plurality of processors with logic units to train one or more neural networks for image construction, at least in part, using established one or more levels of compression for image data from a region of interest (ROI).
-
Publication No.: US20250045604A1
Publication Date: 2025-02-06
Application No.: US18229929
Application Date: 2023-08-03
Applicant: NVIDIA Corporation
Inventor: Shekhar Dwivedi, Rahul Choudhury
Abstract: Apparatuses, systems, and frameworks for provisioning of efficient pipelines capable of multi-model inference and data processing, including streaming data applications. The disclosed techniques allow efficient deployment and execution of multiple machine learning models using pluggable inference and data processing backends by users without specialized developer experience.