-
Publication No.: US20250086090A1
Publication date: 2025-03-13
Application No.: US18244206
Filing date: 2023-09-08
Applicant: NVIDIA Corporation
Inventor: Durgadoss R, Hariharan Sandanagobalane, Pradeep Kumar, Khaled Abdul Karim Mohammed, Vatsa Santhanam
Abstract: Apparatuses, systems, and techniques to perform variable value profiling. In at least one embodiment, variable value profiling is to be performed by one or more accelerators.
-
Publication No.: US20230305845A1
Publication date: 2023-09-28
Application No.: US17710699
Filing date: 2022-03-31
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards, Stephen Anthony Bernard Jones, David Anthony Fontaine, Sebastian Piotr Jodlowski, Aditya Avinash Atluri, Andrew Robert Kerr, Michael Andrew Clark, Gonzalo Brito Gadeschi, Olivier Giroux, Jaydeep Marathe, Thibaut Lutz, Hariharan Sandanagobalane, Gokul Ramaswamy Hirisave Chandra Shekhara, Girish Bhaskarrao Bharambe, Rishkul Kulkarni, Konstantinos Kyriakopoulos
CPC classification number: G06F9/3009, G06F9/30043, G06F9/544, G06F9/5016
Abstract: Apparatuses, systems, and techniques to cause data to be selectively stored in one or more memory locations. In at least one embodiment, a processor is to cause data to be selectively stored in one or more memory locations based, at least in part, on one or more threads to use the data.
-
Publication No.: US20190108006A1
Publication date: 2019-04-11
Application No.: US16154542
Filing date: 2018-10-08
Applicant: Nvidia Corporation
Inventor: Hariharan Sandanagobalane, Sean Lee, Vinod Grover
Abstract: System and method of compiling a program having a mixture of host code and device code to enable code coverage data collection for device code execution. An exemplary integrated compiler can compile source code programmed to be executed by a host processor (e.g., CPU) and a co-processor (e.g., a GPU) concurrently. The compilation can generate an instrumented executable code which includes: coverage instrumentation counters for the device functions; mapping information that maps the counters with the instrumented source points; and instructions for the host processor to allocate and initialize device memory for the counters and to retrieve collected code coverage information from the device memory to the host memory. Execution of the instrumented executable can yield a coverage report on the device code functions.
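The coverage flow in this abstract (counters per instrumented point, a counter-to-source mapping, host-side allocation and retrieval) can be illustrated with a minimal host-only Python model. This is a sketch under stated assumptions, not the patented implementation: there is no GPU here, a plain list stands in for device memory, and every name (`COUNTER_MAP`, `scale`, `allocate_counters`, `retrieve_counters`, `run_coverage`) is hypothetical.

```python
# Hypothetical mapping from counter index to instrumented source point.
COUNTER_MAP = {
    0: "scale: entry",
    1: "scale: x > 0 branch",
    2: "scale: else branch",
}


def allocate_counters(n):
    """Stand-in for the host allocating and zero-initializing device memory."""
    return [0] * n


def scale(x, counters):
    """Stand-in for an instrumented device function (assumed example)."""
    counters[0] += 1          # function entry was reached
    if x > 0:
        counters[1] += 1      # positive branch was reached
        return 2 * x
    counters[2] += 1          # non-positive branch was reached
    return -x


def retrieve_counters(counters):
    """Stand-in for copying counters from device memory to host memory."""
    return list(counters)


def coverage_report(host_counts, counter_map):
    """Join the retrieved counts with the counter-to-source mapping."""
    return {counter_map[i]: count for i, count in enumerate(host_counts)}


def run_coverage(inputs):
    """Allocate counters, run the instrumented function, build the report."""
    counters = allocate_counters(len(COUNTER_MAP))
    for x in inputs:
        scale(x, counters)
    return coverage_report(retrieve_counters(counters), COUNTER_MAP)
```

Calling `run_coverage` with only positive inputs leaves the else-branch counter at zero, which is the kind of uncovered source point a coverage report is meant to expose.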
-
Publication No.: US11579852B2
Publication date: 2023-02-14
Application No.: US16939313
Filing date: 2020-07-27
Applicant: NVIDIA Corporation
Inventor: Hariharan Sandanagobalane, Sean Lee, Vinod Grover
IPC: G06F8/41, G06F11/36, G06F16/903, G06F16/901, G06F9/445
Abstract: System and method of compiling a program having a mixture of host code and device code to enable Profile Guided Optimization (PGO) for device code execution. An exemplary integrated compiler can compile source code programmed to be executed by a host processor (e.g., CPU) and a co-processor (e.g., a GPU) concurrently. The compilation can generate an instrumented executable code which includes: profile instrumentation counters for the device functions; and instructions for the host processor to allocate and initialize device memory for the counters and to retrieve collected profile information from the device memory to generate instrumentation output. The output is fed back to the compiler for compiling the source code a second time to generate optimized executable code for the device functions defined in the source code.
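The two-pass feedback loop described in this abstract can be modeled in a few lines of Python. This is only an illustrative sketch: the "compiler" here is a hypothetical function that reorders branch tests by observed frequency, which is one common use of profile data and not necessarily the optimization the patent claims. The names `profile_run` and `compile_with_profile` and the threshold of 10 are assumptions.

```python
def profile_run(inputs):
    """First pass: run the instrumented code and return branch counts.

    The returned dict plays the role of the instrumentation output that
    the host retrieves from device memory.
    """
    counts = {"small": 0, "large": 0}
    for x in inputs:
        counts["small" if x < 10 else "large"] += 1
    return counts


def compile_with_profile(profile):
    """Second pass: build a function that tests the hotter case first.

    Returns the optimized function and the branch order it chose.
    """
    # (predicate, body) pairs for each branch of the assumed example.
    handlers = {
        "small": (lambda x: x < 10, lambda x: x + 1),
        "large": (lambda x: x >= 10, lambda x: x * 2),
    }
    # Order branches from most to least frequently taken.
    hot_first = sorted(profile, key=profile.get, reverse=True)
    ordered = [handlers[name] for name in hot_first]

    def optimized(x):
        for predicate, body in ordered:
            if predicate(x):
                return body(x)

    return optimized, hot_first
```

Feeding `profile_run([1, 20, 30, 40])` into `compile_with_profile` yields a function that checks the `large` case first, while its results stay the same as with the original branch order.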
-
Publication No.: US10853044B2
Publication date: 2020-12-01
Application No.: US16154560
Filing date: 2018-10-08
Applicant: Nvidia Corporation
Inventor: Hariharan Sandanagobalane, Sean Lee, Vinod Grover
IPC: G06F8/41, G06F11/36, G06F16/903, G06F16/901, G06F9/445
Abstract: System and method of compiling a program having a mixture of host code and device code to enable Profile Guided Optimization (PGO) for device code execution. An exemplary integrated compiler can compile source code programmed to be executed by a host processor (e.g., CPU) and a co-processor (e.g., a GPU) concurrently. The compilation can generate an instrumented executable code which includes: profile instrumentation counters for the device functions; and instructions for the host processor to allocate and initialize device memory for the counters and to retrieve collected profile information from the device memory to generate instrumentation output. The output is fed back to the compiler for compiling the source code a second time to generate optimized executable code for the device functions defined in the source code.
-
Publication No.: US20190146766A1
Publication date: 2019-05-16
Application No.: US16154560
Filing date: 2018-10-08
Applicant: Nvidia Corporation
Inventor: Hariharan Sandanagobalane, Sean Lee, Vinod Grover
IPC: G06F8/41, G06F16/901, G06F16/903
Abstract: System and method of compiling a program having a mixture of host code and device code to enable Profile Guided Optimization (PGO) for device code execution. An exemplary integrated compiler can compile source code programmed to be executed by a host processor (e.g., CPU) and a co-processor (e.g., a GPU) concurrently. The compilation can generate an instrumented executable code which includes: profile instrumentation counters for the device functions; and instructions for the host processor to allocate and initialize device memory for the counters and to retrieve collected profile information from the device memory to generate instrumentation output. The output is fed back to the compiler for compiling the source code a second time to generate optimized executable code for the device functions defined in the source code.