-
公开(公告)号:US20240169468A1
公开(公告)日:2024-05-23
申请号:US18081561
申请日:2022-12-14
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards , Olivier Giroux , Jack H. Choquette , Gokul Ramaswamy Hirisave Chandra Shekhara , Rui Guo , Chao Li , Vishalkumar Ketankumar Mehta , David Dastous St. Hilaire , Aditya Avinash Atluri , Apoorv Parle , Ronny Meir Krashinsky , Subhasmita Chakraborty , Vikram Dhar
Abstract: Apparatuses, systems, and techniques to cause information to be provided. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause an amount of information to be accessed as a result of one or more memory transactions to be provided to one or more users.
-
22.
公开(公告)号:US20240169022A1
公开(公告)日:2024-05-23
申请号:US18072053
申请日:2022-11-30
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards , Kyrylo Perelygin , Maciej Tyrlik , Gokul Ramaswamy Hirisave Chandra Shekhara , Balaji Krishna Yugandhar Atukuri , Rishkul Kulkarni , Konstantinos Kyriakopoulos , Edward H. Gornish , David Allan Berson , Bageshri Sathe , James Player , Aman Arora , Alan Kaatz , Andrew Kerr , Haicheng Wu , Cris Cecka , Vijay Thakkar , Sean Treichler , Jack H. Choquette , Aditya Avinash Atluri , Apoorv Parle , Ronny Meir Krashinsky , Cody Addison , Girish Bhaskarrao Bharambe
CPC classification number: G06F17/16 , G06F9/3001 , G06F9/3009
Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until matrix multiply-accumulate (MMA) memory transactions are performed.
-
公开(公告)号:US20240168632A1
公开(公告)日:2024-05-23
申请号:US18081534
申请日:2022-12-14
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards , Olivier Giroux , Jack H. Choquette , Gokul Ramaswamy Hirisave Chandra Shekhara , Rui Guo , Chao Li , Vishalkumar Ketankumar Mehta , David Dastous St. Hilaire , Aditya Avinash Atluri , Apoorv Parle , Ronny Meir Krashinsky , Subhasmita Chakraborty , Vikram Dhar
IPC: G06F3/06
CPC classification number: G06F3/061 , G06F3/0659 , G06F3/0673
Abstract: Apparatuses, systems, and techniques to facilitate asynchronous data movement accounting. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause one or more memory transactions to be performed without storing information about the one or more memory transactions.
-
公开(公告)号:US20240161223A1
公开(公告)日:2024-05-16
申请号:US18086464
申请日:2022-12-21
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards , Stephen Anthony Bernard Jones , Alexander Lev Minkin , Olivier Giroux , Gokul Ramaswamy Hirisave Chandra Shekhara , Vishalkumar Ketankumar Mehta , Aditya Avinash Atluri , Apoorv Parle , Chao Li , Ronny Meir Krashinsky , Alan Kaatz , Andrew Robert Kerr , Jack H. Choquette
Abstract: Apparatuses, systems, and techniques to cause a first tensor to be translated into a second tensor according to a tensor map. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause a first tensor to be translated into a second tensor according to a tensor map.
-
25.
公开(公告)号:US20240036957A1
公开(公告)日:2024-02-01
申请号:US17955175
申请日:2022-09-28
Applicant: NVIDIA Corporation
Inventor: Ze Long , Kyrylo Perelygin , Harold Carter Edwards , Gokul Ramaswamy Hirisave Chandra Shekhara , Jaydeep Marathe , Ronny Meir Krashinsky , Girish Bhaskarrao Bharambe
IPC: G06F9/54
Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to cause memory to be shared between two or more groups of blocks of threads.
-
公开(公告)号:US20240036955A1
公开(公告)日:2024-02-01
申请号:US17955133
申请日:2022-09-28
Applicant: NVIDIA Corporation
Inventor: Ze Long , Kyrylo Perelygin , Harold Carter Edwards , Gokul Ramaswamy Hirisave Chandra Shekhara , Jaydeep Marathe , Ronny Meir Krashinsky , Girish Bhaskarrao Bharambe
CPC classification number: G06F9/544 , G06F9/4881
Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate one or more limitations of one or more attributes of one or more groups of blocks of one or more threads.
-
公开(公告)号:US20240036916A1
公开(公告)日:2024-02-01
申请号:US17955094
申请日:2022-09-28
Applicant: NVIDIA Corporation
Inventor: Ze Long , Kyrylo Perelygin , Harold Carter Edwards , Gokul Ramaswamy Hirisave Chandra Shekhara , Jaydeep Marathe , Ronny Meir Krashinsky , Girish Bhaskarrao Bharambe
CPC classification number: G06F9/4881 , G06F9/545 , G06F9/5044
Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate a maximum number of blocks of threads capable of being scheduled in parallel.
-
公开(公告)号:US20240036915A1
公开(公告)日:2024-02-01
申请号:US17955070
申请日:2022-09-28
Applicant: NVIDIA Corporation
Inventor: Ze Long , Kyrylo Perelygin , Harold Carter Edwards , Gokul Ramaswamy Hirisave Chandra Shekhara , Jaydeep Marathe , Ronny Meir Krashinsky , Girish Bhaskarrao Bharambe
CPC classification number: G06F9/4881 , G06F9/505
Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to determine a scheduling policy of one or more blocks of one or more threads.
-
公开(公告)号:US11294713B2
公开(公告)日:2022-04-05
申请号:US16825831
申请日:2020-03-20
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards
Abstract: Apparatuses, systems, and techniques to parallelize operations in one or more programs with data copies from global memory to shared memory in each of the one or more programs. In at least one embodiment, a program performs operations on shared data and then asynchronously copies shared data to shared memory, and continues performing additional operations in parallel while the shared data is copied to shared memory until an indicator provided by an application programming interface to facilitate parallel computing, such as CUDA, informs said program that shared data has been copied to shared memory.
-
公开(公告)号:US20250123969A1
公开(公告)日:2025-04-17
申请号:US18381559
申请日:2023-10-18
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards , Daniel Joseph Lustig , Gonzalo Brito Gadeschi , Subhasmita Chakraborty , Gokul Ramaswamy Hirisave Chandra Shekhara
IPC: G06F12/0891 , G06F12/0811
Abstract: Apparatuses, systems, and techniques to cause information to be invalidated in a second cache location after information is stored in a first cache location. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause information to be invalidated in a second cache location after information is stored in a first cache location.
-
-
-
-
-
-
-
-
-