-
Publication No.: US20180020054A1
Publication Date: 2018-01-18
Application No.: US15650296
Filing Date: 2017-07-14
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Michael Woodacre, Randal S. Passint
IPC: H04L29/08, H04L12/947
CPC classification number: H04L67/1097, G06F12/0817, G06F12/0831, G06F2212/224, G06F2212/621, H04L49/25, H04L49/358, H04L67/2842
Abstract: An apparatus and method exchange data between two nodes of a high performance computing (HPC) system using a data communication link. The apparatus has one or more processing cores, RDMA engines, cache coherence engines, and multiplexers. The multiplexers may be programmed by a user application, for example through an API, to selectively couple the RDMA engines, the cache coherence engines, or a mix of the two to the data communication link. Bulk data transfer to the nodes of the HPC system may be performed using paged RDMA during initialization. Then, during the computation proper, random access to remote data may be performed using a coherence protocol (e.g., MESI) that operates on much smaller cache lines.
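To make the engine-selection idea in this abstract concrete, below is a minimal, self-contained C sketch under assumed names: hpc_link_t, hpc_mux_select, hpc_rdma_put, and hpc_coherent_load are hypothetical identifiers invented for illustration and are not part of any real HPE interface. The sketch models programming the multiplexer to couple the RDMA engine to the link for page-granularity bulk transfer at initialization, then switching to the coherence engine for cache-line-sized remote accesses during the computation:

/* Hypothetical sketch only: all types and functions below are invented
 * names for illustration, not a real HPE API. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum { MUX_RDMA, MUX_COHERENCE } mux_mode_t;

typedef struct {
    mux_mode_t mode;              /* which engine currently drives the link */
    uint8_t    remote_mem[4096];  /* stand-in for memory on the peer node */
} hpc_link_t;

/* Program the multiplexer: couple either the RDMA engine or the
 * cache coherence engine to the data communication link. */
static void hpc_mux_select(hpc_link_t *link, mux_mode_t mode) {
    link->mode = mode;
}

/* Bulk, page-granularity transfer (models paged RDMA at initialization). */
static int hpc_rdma_put(hpc_link_t *link, size_t off, const void *src, size_t len) {
    if (link->mode != MUX_RDMA || off + len > sizeof link->remote_mem)
        return -1;
    memcpy(link->remote_mem + off, src, len);
    return 0;
}

/* Fine-grained, cache-line-sized remote read (models coherent access
 * during the computation proper, e.g. under a MESI-style protocol). */
static int hpc_coherent_load(hpc_link_t *link, size_t off, void *dst, size_t line) {
    if (link->mode != MUX_COHERENCE || off + line > sizeof link->remote_mem)
        return -1;
    memcpy(dst, link->remote_mem + off, line);
    return 0;
}

int main(void) {
    hpc_link_t link = { .mode = MUX_RDMA };
    uint8_t page[4096] = { [0] = 42 };

    /* Initialization: bulk transfer of a whole page over RDMA. */
    hpc_mux_select(&link, MUX_RDMA);
    hpc_rdma_put(&link, 0, page, sizeof page);

    /* Computation: switch the mux and touch a single 64-byte cache line. */
    hpc_mux_select(&link, MUX_COHERENCE);
    uint8_t line[64];
    hpc_coherent_load(&link, 0, line, sizeof line);
    printf("first byte of remote cache line: %u\n", line[0]);
    return 0;
}

The single mux_mode_t field stands in for the claimed ability to mix engines on one link; a per-region or per-engine setting would model the "mix of the two" case more faithfully.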
-
Publication No.: US20180018196A1
Publication Date: 2018-01-18
Application No.: US15650357
Filing Date: 2017-07-14
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Steven J. Dean, Michael Woodacre, Randal S. Passint, Eric C. Fromm, Thomas E. McGee, Michael E. Malewicki, Kirill Malkin
CPC classification number: G06F9/45558, G06F9/50, G06F9/5077, G06F9/54, G06F2009/45595, G06Q10/06, H04L29/08315, H04L67/1042
Abstract: A high performance computing (HPC) system has an architecture that separates data paths used by compute nodes exchanging computational data from the data paths used by compute nodes to obtain computational work units and save completed computations. The system enables an improved method of saving checkpoint data, and an improved method of using an analysis of the saved data to assign particular computational work units to particular compute nodes. The system includes a compute fabric and compute nodes that cooperatively perform a computation by mutual communication using the compute fabric. The system also includes a local data fabric that is coupled to the compute nodes, a memory, and a data node. The data node is configured to retrieve data for the computation from an external bulk data storage, and to store its work units in the memory for access by the compute nodes.
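As a rough illustration of the separated data paths described in this abstract, the following self-contained C sketch models the data node and the compute nodes as functions sharing one address space. All names (work_unit_t, data_node_stage, compute_node_run, checkpoint) are hypothetical and invented for illustration; a real system would stage work units over the local data fabric and exchange computational data over the compute fabric rather than through a shared array:

/* Hypothetical single-process model of the data-path split; names are
 * invented for illustration and do not reflect any real HPE software. */
#include <stdio.h>

#define N_UNITS 4

typedef struct {
    int    id;
    double payload;   /* data staged from external bulk storage */
    double result;    /* filled in by a compute node */
    int    done;
} work_unit_t;

/* Data node: pull input from bulk storage (stubbed here) and place
 * work units in memory reachable over the local data fabric. */
static void data_node_stage(work_unit_t *pool, int n) {
    for (int i = 0; i < n; i++) {
        pool[i].id = i;
        pool[i].payload = (double)(i + 1);  /* stand-in for staged data */
        pool[i].done = 0;
    }
}

/* Compute node: take a work unit, compute (peer exchange over the
 * compute fabric is omitted), and mark it complete. */
static void compute_node_run(work_unit_t *pool, int n) {
    for (int i = 0; i < n; i++) {
        pool[i].result = pool[i].payload * pool[i].payload;
        pool[i].done = 1;
    }
}

/* Checkpoint: completed results flow back along the data path, kept
 * separate from the compute fabric so computation is not stalled. */
static void checkpoint(const work_unit_t *pool, int n) {
    for (int i = 0; i < n; i++)
        if (pool[i].done)
            printf("checkpoint unit %d: %.1f\n", pool[i].id, pool[i].result);
}

int main(void) {
    work_unit_t pool[N_UNITS];
    data_node_stage(pool, N_UNITS);
    compute_node_run(pool, N_UNITS);
    checkpoint(pool, N_UNITS);
    return 0;
}

The point of the split is visible in the call structure: data_node_stage and checkpoint touch only the data path, while compute_node_run would be the only stage using the compute fabric, so checkpointing and work-unit assignment never compete with inter-node computation traffic.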
-
Publication No.: US10521260B2
Publication Date: 2019-12-31
Application No.: US15650357
Filing Date: 2017-07-14
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Steven J. Dean, Michael Woodacre, Randal S. Passint, Eric C. Fromm, Thomas E. McGee, Michael E. Malewicki, Kirill Malkin
Abstract: A high performance computing (HPC) system has an architecture that separates data paths used by compute nodes exchanging computational data from the data paths used by compute nodes to obtain computational work units and save completed computations. The system enables an improved method of saving checkpoint data, and an improved method of using an analysis of the saved data to assign particular computational work units to particular compute nodes. The system includes a compute fabric and compute nodes that cooperatively perform a computation by mutual communication using the compute fabric. The system also includes a local data fabric that is coupled to the compute nodes, a memory, and a data node. The data node is configured to retrieve data for the computation from an external bulk data storage, and to store its work units in the memory for access by the compute nodes.
-
Publication No.: US10404800B2
Publication Date: 2019-09-03
Application No.: US15650296
Filing Date: 2017-07-14
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Michael Woodacre, Randal S. Passint
IPC: G06F12/00, H04L29/08, G06F12/0817, G06F12/0831, H04L12/947, H04L12/931
Abstract: An apparatus and method exchange data between two nodes of a high performance computing (HPC) system using a data communication link. The apparatus has one or more processing cores, RDMA engines, cache coherence engines, and multiplexers. The multiplexers may be programmed by a user application, for example through an API, to selectively couple the RDMA engines, the cache coherence engines, or a mix of the two to the data communication link. Bulk data transfer to the nodes of the HPC system may be performed using paged RDMA during initialization. Then, during the computation proper, random access to remote data may be performed using a coherence protocol (e.g., MESI) that operates on much smaller cache lines.