-
Publication No.: US20230131961A1
Publication date: 2023-04-27
Application No.: US17508290
Filing date: 2021-10-22
Applicant: NVIDIA Corporation
Inventor: Kyrylo Perelygin , Alicia Xiao Hu
Abstract: Apparatuses, systems, and techniques to configure processor partitioning for a multi-process service. In at least one embodiment, a multi-process service configures a set of streaming multiprocessors of one or more parallel processing units to perform one or more threads in response to an application programming interface (API).
-
Publication No.: US20230118662A1
Publication date: 2023-04-20
Application No.: US17504041
Filing date: 2021-10-18
Applicant: NVIDIA Corporation
Inventor: Alicia Xiao Hu , Kyrylo Perelygin
IPC: G06F9/50 , G06F9/30 , G06F9/4401
Abstract: Apparatuses, systems, and techniques to configure processor partitioning for a multi-process service. In at least one embodiment, a multi-process service configures a set of streaming multiprocessors of one or more parallel processing units to perform one or more threads based on one or more user-defined data values accessible to a parallel processing library, such as compute uniform device architecture (CUDA).
-
Publication No.: US20230111125A1
Publication date: 2023-04-13
Application No.: US17497731
Filing date: 2021-10-08
Applicant: NVIDIA Corporation
Inventor: Piotr Ciolkosz , Kyrylo Perelygin , Harold Carter Edwards , Wesley Maxey
Abstract: Apparatuses, systems, and techniques to perform parallel processing. In at least one embodiment, a parallel processing algorithm for performing an additive prefix scan is selected from a plurality of alternatives based on an arrangement of a group of threads provided to perform the scan.
-
Publication No.: US20240289129A1
Publication date: 2024-08-29
Application No.: US18433741
Filing date: 2024-02-06
Applicant: NVIDIA Corporation
Inventor: Piotr Tomasz Ciolkosz , Kyrylo Perelygin , Harold Carter Edwards , Gonzalo Brito Gadeschi , Georgii Evtushenko , Jake Hemstad , Vishalkumar Ketankumar Mehta , Michal Dominiak , Olivier Giroux , Konstantinos Kyriakopoulos
IPC: G06F9/30
CPC classification number: G06F9/3009 , G06F9/30181
Abstract: Apparatuses, systems, and techniques to perform an application programming interface (API) to select a single thread from a group of threads to perform a set of instructions. In at least one embodiment, processors or computer systems are to perform an API to indicate instructions to be performed by a single thread and to select that thread from a group of threads to perform said instructions.
-
Publication No.: US20240169023A1
Publication date: 2024-05-23
Application No.: US18072060
Filing date: 2022-11-30
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards , Kyrylo Perelygin , Maciej Tyrlik , Gokul Ramaswamy Hirisave Chandra Shekhara , Balaji Krishna Yugandhar Atukuri , Rishkul Kulkarni , Konstantinos Kyriakopoulos , Edward H. Gornish , David Allan Berson , Bageshri Sathe , James Player , Aman Arora , Alan Kaatz , Andrew Kerr , Haicheng Wu , Cris Cecka , Vijay Thakkar , Sean Treichler , Jack H. Choquette , Aditya Avinash Atluri , Apoorv Parle , Ronny Meir Krashinsky , Cody Addison , Girish Bhaskarrao Bharambe
IPC: G06F17/16
CPC classification number: G06F17/16
Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to indicate whether matrix multiply-accumulate (MMA) memory operations are complete.
-
Publication No.: US20240036954A1
Publication date: 2024-02-01
Application No.: US17955106
Filing date: 2022-09-28
Applicant: NVIDIA Corporation
Inventor: Ze Long , Kyrylo Perelygin , Harold Carter Edwards , Gokul Ramaswamy Hirisave Chandra Shekhara , Jaydeep Marathe , Ronny Meir Krashinsky , Girish Bhaskarrao Bharambe
CPC classification number: G06F9/544 , G06F9/4881
Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate one or more attributes of one or more groups of blocks of one or more threads.
-
Publication No.: US20240036917A1
Publication date: 2024-02-01
Application No.: US17955110
Filing date: 2022-09-28
Applicant: NVIDIA Corporation
Inventor: Ze Long , Kyrylo Perelygin , Harold Carter Edwards , Gokul Ramaswamy Hirisave Chandra Shekhara , Jaydeep Marathe , Ronny Meir Krashinsky , Girish Bhaskarrao Bharambe
CPC classification number: G06F9/4881 , G06F9/5044 , G06F9/545
Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate a maximum number of blocks of threads to be scheduled in parallel.
-
Publication No.: US20230305853A1
Publication date: 2023-09-28
Application No.: US17705154
Filing date: 2022-03-25
Applicant: NVIDIA Corporation
Inventor: Piotr Ciolkosz , Kyrylo Perelygin , Harold Carter Edwards , Wesley Maxey
CPC classification number: G06F9/3851 , G06T15/005
Abstract: Apparatuses, systems, and techniques to perform collective operations using parallel processing. In at least one embodiment, a non-blocking application programming interface allows programs to improve performance of one or more collective operations on a GPU.
-
Publication No.: US20230222619A1
Publication date: 2023-07-13
Application No.: US17575471
Filing date: 2022-01-13
Applicant: NVIDIA Corporation
Inventor: David Anthony Fontaine , Maciej Marcin Piechotka , Kyrylo Perelygin , Lukasz Krystian Ligowski , Ashutosh Jain , Jitendra Pratap Singh Chauhan , Jaydeep Marathe , Magnus Strengert , Xiaonan Tian , Sebastian Piotr Jodlowski , John Clifton Woolley, JR.
CPC classification number: G06T1/20 , G06F9/5077 , G06T1/60 , G06F9/547
Abstract: Apparatuses, systems, and techniques to indicate contextual information to be used by available logical processors. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to indicate a first set of contextual information to be used by a first subset of available processors.
-
Publication No.: US20230086989A1
Publication date: 2023-03-23
Application No.: US17478079
Filing date: 2021-09-17
Applicant: NVIDIA Corporation
Inventor: Piotr Ciolkosz , Kyrylo Perelygin , Harold Carter Edwards , Wesley Maxey
Abstract: Apparatuses, systems, and techniques to facilitate parallel processing. In at least one embodiment, an application programming interface allows a user to define a plurality of cooperative thread groups, and launch multiple cooperative thread groups in parallel provided sufficient processing resources are available.
-