Patent search ap:("NVIDIA Corporation") AND inv:"Kyrylo Perelygin" Page 1

1.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO STOP PERFORMANCE OF THREADS (under examination, published)

Publication (announcement) number: US20240036945A1

Publication (announcement) date: 2024-02-01

Application number: US17955153

Application date: 2022-09-28

Applicant: NVIDIA Corporation

Inventor: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe

IPC: G06F9/52, G06F9/54

CPC classification number: G06F9/522, G06F9/545

Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to cause performance of one or more threads within a group of blocks of threads to stop at least until all threads within the group of blocks have performed a barrier instruction.
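The behavior described in this abstract, where no thread in a group of blocks proceeds until every thread in the group has reached the barrier, can be illustrated with a CPU-side analogy. The sketch below is not the claimed API and does not run on a GPU; it uses Python's `threading.Barrier`, and the names `NUM_BLOCKS`, `THREADS_PER_BLOCK`, and `run_group_barrier` are hypothetical.

```python
import threading

NUM_BLOCKS = 2          # hypothetical: number of blocks in the group
THREADS_PER_BLOCK = 4   # hypothetical: threads per block


def run_group_barrier():
    """Run every thread of every block through one shared barrier."""
    total = NUM_BLOCKS * THREADS_PER_BLOCK
    barrier = threading.Barrier(total)  # one barrier for the whole group
    lock = threading.Lock()
    events = []

    def worker(block, thread):
        with lock:
            events.append(("before", block, thread))
        # Each thread stops here until all `total` threads have arrived.
        barrier.wait()
        with lock:
            events.append(("after", block, thread))

    threads = [
        threading.Thread(target=worker, args=(b, t))
        for b in range(NUM_BLOCKS)
        for t in range(THREADS_PER_BLOCK)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


events = run_group_barrier()
```

Because every thread records its "before" event prior to waiting, all "before" events precede every "after" event in the log, whatever the thread scheduling.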

2.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO INDICATE PERFORMANCE OF BARRIER INSTRUCTION (under examination, published)

Publication (announcement) number: US20240036944A1

Publication (announcement) date: 2024-02-01

Application number: US17955143

Application date: 2022-09-28

Applicant: NVIDIA Corporation

Inventor: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe

IPC: G06F9/52, G06F9/54

CPC classification number: G06F9/522, G06F9/545

Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate whether one or more threads within two or more blocks of threads have performed a barrier instruction.

3.

Invention application
SYNCHRONIZATION BARRIER (in force)

Publication (announcement) number: US20220413945A1

Publication (announcement) date: 2022-12-29

Application number: US17366770

Application date: 2021-07-02

Applicant: NVIDIA Corporation

Inventor: Piotr Ciolkosz, Kyrylo Perelygin, Harold Carter Edwards, Wesley Maxey

IPC: G06F9/52, G06F9/30, G06F9/38

Abstract: Apparatuses, systems, and techniques to implement a barrier operation. In at least one embodiment, a memory barrier operation causes accesses to memory by a plurality of groups of threads to occur in an order indicated by the memory barrier operation.

4.

Granted invention patent
Application programming interface to wait on matrix multiply-accumulate (in force)

Publication (announcement) number: US12204897B2

Publication (announcement) date: 2025-01-21

Application number: US18072081

Application date: 2022-11-30

Applicant: NVIDIA Corporation

Inventor: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe

IPC: G06F9/30, G06F9/38, G06F17/16

Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until a portion of matrix multiply-accumulate (MMA) operations have been performed.
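The idea of making dependent work wait until only a portion of outstanding MMA operations has finished can be sketched with a CPU-side analogy. This is an illustration under stated assumptions, not the claimed API: the asynchronous operations are modeled as `concurrent.futures` tasks, and `mma` and `wait_for_portion` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def mma(a, b, c):
    """Return a @ b + c for small matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [c[i][j] + sum(a[i][p] * b[p][j] for p in range(inner))
         for j in range(cols)]
        for i in range(rows)
    ]


def wait_for_portion(futures, count):
    """Block until the first `count` submitted operations have finished."""
    return [f.result() for f in futures[:count]]


a = [[1, 2], [3, 4]]
identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Issue four asynchronous multiply-accumulate operations.
    futures = [pool.submit(mma, a, identity, zero) for _ in range(4)]
    # Dependent work waits on only the first two; the rest may still run.
    first_two = wait_for_portion(futures, 2)
    dependent = mma(first_two[0], identity, first_two[1])
```

Waiting on a subset rather than on all issued operations lets later operations overlap with the dependent computation.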

5.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO INDICATE OPERATIONS TO BE PERFORMED BY CORRESPONDING STREAMING MULTIPROCESSORS (under examination, published)

Publication (announcement) number: US20240168763A1

Publication (announcement) date: 2024-05-23

Application number: US18072300

Application date: 2022-11-30

Applicant: NVIDIA Corporation

Inventor: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe

IPC: G06F9/30, G06F17/16

CPC classification number: G06F9/3001, G06F17/16

Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause two or more other computational operations to be performed by two or more streaming multiprocessors (SMs).

6.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO WAIT ON MATRIX MULTIPLY-ACCUMULATE (under examination, published)

Publication (announcement) number: US20240168762A1

Publication (announcement) date: 2024-05-23

Application number: US18072081

Application date: 2022-11-30

Applicant: NVIDIA Corporation

Inventor: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe

IPC: G06F9/30, G06F17/16

CPC classification number: G06F9/3001, G06F9/3009, G06F17/16

Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until a portion of matrix multiply-accumulate (MMA) operations have been performed.

7.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO SCHEDULE THREAD BLOCKS (under examination, published)

Publication (announcement) number: US20240036952A1

Publication (announcement) date: 2024-02-01

Application number: US17955052

Application date: 2022-09-28

Applicant: NVIDIA Corporation

Inventor: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe

IPC: G06F9/54, G06F9/48

CPC classification number: G06F9/544, G06F9/4881

Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to determine which of two or more blocks of threads are to be scheduled in parallel.

8.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO INDICATE THREAD BLOCKS (under examination, published)

Publication (announcement) number: US20240036951A1

Publication (announcement) date: 2024-02-01

Application number: US17955023

Application date: 2022-09-28

Applicant: NVIDIA Corporation

Inventor: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe

IPC: G06F9/54, G06F9/48

CPC classification number: G06F9/544, G06F9/4881

Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate two or more blocks of threads to be scheduled in parallel.

9.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO SHARE DATA WITH THREADS (under examination, published)

Publication (announcement) number: US20240289186A1

Publication (announcement) date: 2024-08-29

Application number: US18433786

Application date: 2024-02-06

Applicant: NVIDIA Corporation

Inventor: Piotr Tomasz Ciolkosz, Kyrylo Perelygin, Harold Carter Edwards, Gonzalo Brito Gadeschi, Georgii Evtushenko, Jake Hemstad, Vishalkumar Ketankumar Mehta, Michal Dominiak, Olivier Giroux, Konstantinos Kyriakopoulos

IPC: G06F9/54, G06F9/38

CPC classification number: G06F9/541, G06F9/3889

Abstract: Apparatuses, systems, and techniques to perform an application programming interface (API) to select a single thread from a group of threads to perform a set of instructions and to broadcast a result of performance of said set of instructions to said group of threads. In at least one embodiment, processors or computer systems are to perform an API to indicate instructions to be performed by a single thread and to select that thread from a group of threads to perform said instructions, and to make available to said group of threads data generated as a result of performance of said instructions.
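The pattern in this abstract (elect one thread from a group, have it run a set of instructions, then make the result available to every thread in the group) can be sketched as a CPU-side analogy. This is an illustration only, not the claimed API; `invoke_one_and_broadcast` is a hypothetical helper built on Python's `threading.Barrier`, whose `wait()` returns a distinct integer to each waiting thread.

```python
import threading


def invoke_one_and_broadcast(group_size, fn):
    """Run fn() on exactly one thread and give its result to all threads."""
    barrier = threading.Barrier(group_size)
    shared = {}                     # slot used to publish the result
    elected = []                    # records which rank ran fn()
    results = [None] * group_size   # what each thread observed

    def worker(rank):
        # wait() hands each thread a different index in range(group_size);
        # the thread that receives 0 is treated as the elected one.
        if barrier.wait() == 0:
            elected.append(rank)
            shared["value"] = fn()
        # Second barrier: nobody reads before the result is published.
        barrier.wait()
        results[rank] = shared["value"]

    threads = [
        threading.Thread(target=worker, args=(r,))
        for r in range(group_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, elected


results, elected = invoke_one_and_broadcast(4, lambda: 6 * 7)
```

Which rank is elected depends on scheduling, but exactly one thread runs `fn` and all four observe the same value.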

10.

Invention publication
APPLICATION PROGRAMMING INTERFACE TO SYNCHRONIZE MATRIX MULTIPLY-ACCUMULATE MEMORY TRANSACTIONS (under examination, published)

Publication (announcement) number: US20240169022A1

Publication (announcement) date: 2024-05-23

Application number: US18072053

Application date: 2022-11-30

Applicant: NVIDIA Corporation

Inventor: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe

IPC: G06F17/16, G06F9/30

CPC classification number: G06F17/16, G06F9/3001, G06F9/3009

Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until matrix multiply-accumulate (MMA) memory transactions are performed.
