-
Publication No.: US20230229524A1
Publication date: 2023-07-20
Application No.: US17578255
Filing date: 2022-01-18
Applicant: NVIDIA Corporation
Inventor: Glenn Alan Dearth, Mark Hummel, Daniel Joseph Lustig
CPC classification number: G06F9/522, G06F9/4881, G06F9/3004
Abstract: In various examples, a single notification (e.g., a request for a memory access operation) that a processing element (PE) has reached a synchronization barrier may be propagated to multiple physical addresses (PAs) and/or devices associated with multiple processing elements. Thus, the notification may allow an indication that the processing element has reached the synchronization barrier to be recorded at multiple targets. Each notification may access the PAs of each PE and/or device of a barrier group to update a corresponding counter. The PEs and/or devices may poll or otherwise use the counter to determine when each PE of the group has reached the synchronization barrier. When a corresponding counter indicates synchronization at the synchronization barrier, a PE may proceed with performing a compute task asynchronously with one or more other PEs until a subsequent synchronization barrier may be reached.
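Illustrative sketch (not from the patent text): the counter scheme in the abstract above can be approximated in software as a toy model. Here Python threads stand in for PEs, a list of integers stands in for the per-PE counters at separate physical addresses, and a lock stands in for atomic hardware updates. The names `MulticastBarrier`, `notify`, `reached`, `worker`, and `run` are hypothetical and chosen only for this example.

```python
import threading
import time


class MulticastBarrier:
    """Toy model: one arrival notification updates a counter local to every PE."""

    def __init__(self, num_pes):
        self.num_pes = num_pes
        # One counter per PE, standing in for per-PE physical addresses.
        self.counters = [0] * num_pes
        # Assumption: a lock stands in for atomic hardware updates.
        self.lock = threading.Lock()

    def notify(self):
        """A single notification fanned out to every member's counter."""
        with self.lock:
            for i in range(self.num_pes):
                self.counters[i] += 1

    def reached(self, pe, generation):
        """A PE polls only its own counter to see if all PEs have arrived."""
        with self.lock:
            return self.counters[pe] >= generation * self.num_pes


def worker(barrier, pe, rounds, log):
    for gen in range(1, rounds + 1):
        log.append((gen, pe, "arrive"))
        barrier.notify()
        while not barrier.reached(pe, gen):
            time.sleep(0)  # yield while polling the local counter
        log.append((gen, pe, "pass"))


def run(num_pes=4, rounds=3):
    barrier = MulticastBarrier(num_pes)
    log = []
    threads = [
        threading.Thread(target=worker, args=(barrier, pe, rounds, log))
        for pe in range(num_pes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return barrier, log
```

In this model each PE reads only its own counter, which mirrors the idea that polling stays local while the arrival notification is the only operation that touches every target.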
-
Publication No.: US20250123966A1
Publication date: 2025-04-17
Application No.: US18381545
Filing date: 2023-10-18
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards, Daniel Joseph Lustig, Gonzalo Brito Gadeschi, Subhasmita Chakraborty, Gokul Ramaswamy Hirisave Chandra Shekhara
IPC: G06F12/0811, G06F12/0804
Abstract: Apparatuses, systems, and techniques to prevent information from being read from a second cache location while information is being stored in a first cache location. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to prevent information from being read from a second cache location while information is being stored in a first cache location.
-
Publication No.: US20250123969A1
Publication date: 2025-04-17
Application No.: US18381559
Filing date: 2023-10-18
Applicant: NVIDIA Corporation
Inventor: Harold Carter Edwards, Daniel Joseph Lustig, Gonzalo Brito Gadeschi, Subhasmita Chakraborty, Gokul Ramaswamy Hirisave Chandra Shekhara
IPC: G06F12/0891, G06F12/0811
Abstract: Apparatuses, systems, and techniques to cause information to be invalidated in a second cache location after information is stored in a first cache location. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause information to be invalidated in a second cache location after information is stored in a first cache location.
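Illustrative sketch (not from the patent text): the abstract does not describe the API itself, so the following is only a toy model of the general idea of storing through one cache and then invalidating a stale copy in another. The names `ToyCache` and `store_then_invalidate` are hypothetical, and the write-through behavior is an assumption made so that the second cache can refetch the new value.

```python
class ToyCache:
    """Toy cache in front of a shared dict that stands in for memory."""

    def __init__(self, backing):
        self.backing = backing  # shared backing store (assumption: a dict)
        self.lines = {}         # address -> cached value

    def read(self, addr):
        # On a miss, fetch the value from the backing store.
        if addr not in self.lines:
            self.lines[addr] = self.backing.get(addr, 0)
        return self.lines[addr]

    def invalidate(self, addr):
        # Drop the cached copy, if any, so the next read refetches.
        self.lines.pop(addr, None)


def store_then_invalidate(first, second, addr, value):
    """Hypothetical API: store via `first`, then invalidate `addr` in `second`."""
    first.lines[addr] = value
    first.backing[addr] = value  # assumption: write-through to memory
    second.invalidate(addr)


memory = {0x40: 1}
cache_a, cache_b = ToyCache(memory), ToyCache(memory)
cache_b.read(0x40)                                 # cache_b now holds the old value
store_then_invalidate(cache_a, cache_b, 0x40, 2)   # cache_b's copy is dropped
print(cache_b.read(0x40))                          # refetches from memory
```

Without the invalidation step, `cache_b` would keep returning its stale copy after the store, which is the situation the abstract's "invalidated after stored" ordering appears intended to avoid.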
-
Publication No.: US20230229599A1
Publication date: 2023-07-20
Application No.: US17578266
Filing date: 2022-01-18
Applicant: NVIDIA Corporation
Inventor: Glenn Alan Dearth, Mark Hummel, Daniel Joseph Lustig
IPC: G06F12/1045, G06F12/02, G06F13/16
CPC classification number: G06F12/1063, G06F12/1054, G06F12/0238, G06F13/1668
Abstract: In various examples, a memory model may support multicasting where a single request for a memory access operation may be propagated to multiple physical addresses associated with multiple processing elements (e.g., corresponding to respective local memory). Thus, the request may cause data to be read from and/or written to memory for each of the processing elements. In some examples, a memory model exposes multicasting to processes. This may include providing for separate multicast and unicast instructions or shared instructions with one or more parameters (e.g., indicating a virtual address) being used to indicate multicasting or unicasting. Additionally or alternatively, whether a request(s) is processed using multicasting or unicasting may be opaque to a process and/or application or may otherwise be determined by the system. One or more constraints may be imposed on processing requests using multicasting to maintain a coherent memory interface.
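Illustrative sketch (not from the patent text): one way to picture "shared instructions with a parameter indicating multicasting or unicasting" is a toy address map in which a reserved virtual-address window fans a single request out to the same offset in every PE's local memory. The class `ToyFabric`, the constant `MULTICAST_BASE`, and the address layout are all assumptions made for this example.

```python
MULTICAST_BASE = 0x1000  # hypothetical start of the multicast address window


class ToyFabric:
    """Toy model: each PE has local memory; the address selects uni/multicast."""

    def __init__(self, num_pes, words_per_pe):
        if num_pes * words_per_pe > MULTICAST_BASE:
            raise ValueError("unicast range would overlap the multicast window")
        self.words = words_per_pe
        self.mem = [[0] * words_per_pe for _ in range(num_pes)]

    def targets(self, va):
        """Translate a virtual address into a list of (pe, offset) targets."""
        if va >= MULTICAST_BASE:
            offset = va - MULTICAST_BASE
            if offset >= self.words:
                raise IndexError("multicast offset out of range")
            # Multicast: the same offset in every PE's local memory.
            return [(pe, offset) for pe in range(len(self.mem))]
        pe, offset = divmod(va, self.words)
        if pe >= len(self.mem):
            raise IndexError("unicast address out of range")
        # Unicast: exactly one PE.
        return [(pe, offset)]

    def store(self, va, value):
        for pe, offset in self.targets(va):
            self.mem[pe][offset] = value

    def load(self, va):
        return [self.mem[pe][offset] for pe, offset in self.targets(va)]


fabric = ToyFabric(num_pes=4, words_per_pe=8)
fabric.store(MULTICAST_BASE + 3, 7)  # one request, written to all four PEs
fabric.store(2 * 8 + 3, 9)           # unicast: PE 2, offset 3 only
print(fabric.load(MULTICAST_BASE + 3))
```

The same `store` and `load` entry points serve both cases here, which corresponds to the "shared instructions" variant in the abstract; separate multicast and unicast instructions would be the other variant it mentions. The coherence constraints the abstract refers to are not modeled.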
-