Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes computing circuit boards having processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation. The regions are connected by a plurality of power connectors that convey power from the computing circuit boards to the memory, and a plurality of data connectors that convey data between the first and second regions. The power and data connectors are configured redundantly so that failure of a computing circuit board, a power connector, or a data connector does not interrupt the computation. A method of performing such a computation, and a computer program product implementing the method, are also disclosed.
Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation and another computing processor for performing data movement and storage. Because data movement and storage are offloaded to the secondary processor, the processors for performing the computation are not interrupted to perform these tasks. A method for use in the HPC system receives instructions in the computing processors and first data in the memory. The method includes receiving second data into the memory while continuing to execute the instructions in the computing processors, without interruption. A computer program product implementing the method is also disclosed.
Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes computing circuit boards having processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation. The regions are connected by a plurality of power connectors that convey power from the computing circuit boards to the memory, and a plurality of data connectors that convey data between the first and second regions. The power and data connectors are configured redundantly so that failure of a computing circuit board, a power connector, or a data connector does not interrupt the computation. A method of performing such a computation, and a computer program product implementing the method, are also disclosed.
Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation and another computing processor for performing data movement and storage. Because data movement and storage are offloaded to the secondary processor, the processors for performing the computation are not interrupted to perform these tasks. A method for use in the HPC system receives instructions in the computing processors and first data in the memory. The method includes receiving second data into the memory while continuing to execute the instructions in the computing processors, without interruption. A computer program product implementing the method is also disclosed.