-
Publication No.: US11635986B2
Publication Date: 2023-04-25
Application No.: US16562359
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Eric Rock , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
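The abstract above describes carving a PPU into isolated partitions, each owning a subset of compute and memory resources, with each guest user confined to one partition. A minimal sketch of that resource-accounting model, using entirely hypothetical class and method names (the patent does not disclose an API):

```python
# Hypothetical sketch: an admin carves a PPU into isolated partitions,
# each holding a subset of the PPU's compute and memory resources, and
# guest users are each bound to a single partition.

class Partition:
    def __init__(self, compute_units, memory_mb):
        self.compute_units = compute_units   # subset of the PPU's compute units
        self.memory_mb = memory_mb           # subset of the PPU's memory
        self.guest = None                    # at most one guest per partition

class PPU:
    def __init__(self, total_compute_units, total_memory_mb):
        self.free_compute = total_compute_units
        self.free_memory = total_memory_mb
        self.partitions = []

    def create_partition(self, compute_units, memory_mb):
        """Admin-side call: carve out an isolated partition."""
        if compute_units > self.free_compute or memory_mb > self.free_memory:
            raise ValueError("insufficient PPU resources")
        self.free_compute -= compute_units
        self.free_memory -= memory_mb
        p = Partition(compute_units, memory_mb)
        self.partitions.append(p)
        return p

    def assign_guest(self, partition, guest_id):
        """Bind a guest user to a partition; its work stays inside it."""
        partition.guest = guest_id

# Usage: split an 8-unit / 16 GB PPU into two equal partitions.
ppu = PPU(total_compute_units=8, total_memory_mb=16384)
a = ppu.create_partition(4, 8192)
b = ppu.create_partition(4, 8192)
ppu.assign_guest(a, "guest-0")
ppu.assign_guest(b, "guest-1")
```

Because every partition is accounted for out of a fixed resource pool, two guests can run concurrently without contending for each other's compute units or memory, which is the isolation property the abstract emphasizes.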
-
Publication No.: US11893423B2
Publication Date: 2024-02-06
Application No.: US16562367
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Sonata Gale Wen , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
IPC: G06F9/50 , G06F9/38 , G06F1/3296 , G06F1/04
CPC classification number: G06F9/5061 , G06F1/04 , G06F1/3296 , G06F9/3877 , G06F9/5027
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
-
Publication No.: US11663036B2
Publication Date: 2023-05-30
Application No.: US16562359
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Eric Rock , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
-
Publication No.: US11307903B2
Publication Date: 2022-04-19
Application No.: US15885751
Filing Date: 2018-01-31
Applicant: NVIDIA Corporation
Inventor: Jerome F. Duluk, Jr. , Luke Durant , Ramon Matas Navarro , Alan Menezes , Jeffrey Tuckey , Gentaro Hirota , Brian Pharris
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
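The abstract above outlines a launch decision: the compute work distributor checks that the subcontext holds at least one processor credit (or that an already-acquired TPC has space), then targets a processor whose load is less than or equal to every other processor's. A small sketch of that decision logic, with hypothetical function names and a simple integer load model not taken from the patent:

```python
# Hypothetical sketch of the credit-based launch decision: a thread group
# for a subcontext launches only if the subcontext holds a processor
# credit (or an already-acquired processor has spare space), and it goes
# to a processor whose load is <= the load of every other processor.

def pick_processor(loads):
    """Return the index of a processor with minimal load."""
    return min(range(len(loads)), key=lambda i: loads[i])

def launch_thread_group(credits, loads, acquired_has_space=False):
    """Decide where to launch; returns a processor index, or None."""
    if credits <= 0 and not acquired_has_space:
        return None                 # no credit and no spare capacity
    idx = pick_processor(loads)
    loads[idx] += 1                 # account for the newly launched group
    return idx

loads = [3, 1, 2]
target = launch_thread_group(credits=1, loads=loads)
# target is 1, the index of the least-loaded processor
```

The second clause mirrors the abstract's exception: when no credits remain, a group may still launch if a previously acquired TPC has sufficient space, so `launch_thread_group(credits=0, loads, acquired_has_space=True)` still returns a processor.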
-
Publication No.: US20190235928A1
Publication Date: 2019-08-01
Application No.: US15885751
Filing Date: 2018-01-31
Applicant: NVIDIA Corporation
Inventor: Jerome F. Duluk, Jr. , Luke Durant , Ramon Matas Navarro , Alan Menezes , Jeffrey Tuckey , Gentaro Hirota , Brian Pharris
CPC classification number: G06F9/5061 , G06F9/45558 , G06F9/4881 , G06F9/505 , G06F2209/5018
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
-
Publication No.: US11579925B2
Publication Date: 2023-02-14
Application No.: US16562364
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Eric Rock , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
-
Publication No.: US11249905B2
Publication Date: 2022-02-15
Application No.: US16562361
Filing Date: 2019-09-05
Applicant: NVIDIA CORPORATION
Inventor: Jerome F. Duluk, Jr. , Gregory Scott Palmer , Jonathon Stuart Ramsey Evans , Shailendra Singh , Samuel H. Duncan , Wishwesh Anil Gandhi , Lacky V. Shah , Eric Rock , Feiqi Su , James Leroy Deming , Alan Menezes , Pranav Vaidya , Praveen Joginipally , Timothy John Purcell , Manas Mandal
Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.
-
Publication No.: US10817338B2
Publication Date: 2020-10-27
Application No.: US15885761
Filing Date: 2018-01-31
Applicant: NVIDIA Corporation
Inventor: Jerome F. Duluk, Jr. , Luke Durant , Ramon Matas Navarro , Alan Menezes , Jeffrey Tuckey , Gentaro Hirota , Brian Pharris
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.
-
Publication No.: US20190235924A1
Publication Date: 2019-08-01
Application No.: US15885761
Filing Date: 2018-01-31
Applicant: NVIDIA Corporation
Inventor: Jerome F. Duluk, Jr. , Luke Durant , Ramon Matas Navarro , Alan Menezes , Jeffrey Tuckey , Gentaro Hirota , Brian Pharris
Abstract: Embodiments of the present invention set forth techniques for allocating execution resources to groups of threads within a graphics processing unit. A compute work distributor included in the graphics processing unit receives an indication from a process that a first group of threads is to be launched. The compute work distributor determines that a first subcontext associated with the process has at least one processor credit. In some embodiments, CTAs may be launched even when there are no processor credits, if one of the TPCs that was already acquired has sufficient space. The compute work distributor identifies a first processor included in a plurality of processors that has a processing load that is less than or equal to the processor loads associated with all other processors included in the plurality of processors. The compute work distributor launches the first group of threads to execute on the first processor.