-
Publication No.: US20210248014A1
Publication Date: 2021-08-12
Application No.: US16787967
Filing Date: 2020-02-11
Applicant: NVIDIA Corporation
Inventor: Daniel Lustig, Oreste Villa, David Nellans
IPC: G06F9/50, G06F9/54, G06F12/0882, G06F12/1027, G06F11/07, G06F11/30
Abstract: In general, an application executes on a compute unit, such as a central processing unit (CPU) or graphics processing unit (GPU), to perform some function(s). In some circumstances, improved performance of an application, such as a graphics application, may be provided by executing the application across multiple compute units. However, when using multiple compute units in this manner, synchronization must be provided between the compute units. Synchronization, including the sharing of the data, is typically accomplished through memory. While a shared memory may cause bottlenecks, employing local memory for each compute unit may itself require synchronization (coherence) which can be costly in terms of resources, delay, etc. The present disclosure provides read-write page replication for multiple compute units that avoids the traditional challenges associated with coherence.
-
Publication No.: US11625279B2
Publication Date: 2023-04-11
Application No.: US16787967
Filing Date: 2020-02-11
Applicant: NVIDIA Corporation
Inventor: Daniel Lustig, Oreste Villa, David Nellans
IPC: G06F9/50, G06F11/30, G06F9/54, G06F12/1027, G06F11/07, G06F12/0882
Abstract: In general, an application executes on a compute unit, such as a central processing unit (CPU) or graphics processing unit (GPU), to perform some function(s). In some circumstances, improved performance of an application, such as a graphics application, may be provided by executing the application across multiple compute units. However, when using multiple compute units in this manner, synchronization must be provided between the compute units. Synchronization, including the sharing of the data, is typically accomplished through memory. While a shared memory may cause bottlenecks, employing local memory for each compute unit may itself require synchronization (coherence) which can be costly in terms of resources, delay, etc. The present disclosure provides read-write page replication for multiple compute units that avoids the traditional challenges associated with coherence.
-