-
Publication number: US12282662B2
Publication date: 2025-04-22
Application number: US17898189
Filing date: 2022-08-29
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Thomas Edward McGee, Brian J. Johnson, Frank R. Dropps, Derek S. Schumacher, Stuart C. Haden, Michael S. Woodacre
IPC: G06F3/06, G06F12/0817
Abstract: One aspect of the application can provide a system and method for replacing a failing node with a spare node in a non-uniform memory access (NUMA) system. During operation, in response to determining that a node-migration condition is met, the system can initialize a node controller of the spare node such that accesses to a memory local to the spare node are to be processed by the node controller, quiesce the failing node and the spare node to allow state information of processors on the failing node to be migrated to processors on the spare node, and subsequent to unquiescing the failing node and the spare node, migrate data from the failing node to the spare node while maintaining cache coherence in the NUMA system and while the NUMA system remains in operation, thereby facilitating continuous execution of processes previously executed on the failing node.
-
Publication number: US20240069742A1
Publication date: 2024-02-29
Application number: US17898189
Filing date: 2022-08-29
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Thomas Edward McGee, Brian J. Johnson, Frank R. Dropps, Derek S. Schumacher, Stuart C. Haden, Michael S. Woodacre
IPC: G06F3/06, G06F12/0817
CPC classification number: G06F3/0617, G06F3/0647, G06F3/0679, G06F12/0828, G06F2212/271, G06F2212/621
Abstract: One aspect of the application can provide a system and method for replacing a failing node with a spare node in a non-uniform memory access (NUMA) system. During operation, in response to determining that a node-migration condition is met, the system can initialize a node controller of the spare node such that accesses to a memory local to the spare node are to be processed by the node controller, quiesce the failing node and the spare node to allow state information of processors on the failing node to be migrated to processors on the spare node, and subsequent to unquiescing the failing node and the spare node, migrate data from the failing node to the spare node while maintaining cache coherence in the NUMA system and while the NUMA system remains in operation, thereby facilitating continuous execution of processes previously executed on the failing node.
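Illustrative sketch (not part of the patent record): the ordering of steps described in the abstract can be modeled as a toy Python simulation. Every name here (`Node`, `migrate_node`, the `log` list, the field names) is hypothetical and chosen only for illustration. The sketch assumes a node can be reduced to a dictionary of processor state and a dictionary of memory contents, and it does not model cache coherence, hardware node controllers, or concurrent memory accesses. It only shows the sequence: initialize the spare's controller, quiesce both nodes, move processor state, unquiesce, then move data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a NUMA node (hypothetical model, not from the patent)."""
    name: str
    cpu_state: dict = field(default_factory=dict)  # per-processor context
    memory: dict = field(default_factory=dict)     # address -> value
    controller_ready: bool = False
    quiesced: bool = False


def migrate_node(failing: Node, spare: Node, log: list) -> None:
    """Replay the abstract's step order on two toy nodes, recording each step."""
    # Step 1: initialize the spare's node controller so that accesses to the
    # spare's local memory would be processed by that controller.
    spare.controller_ready = True
    log.append("init_controller")

    # Step 2: quiesce both nodes, then copy processor state across.
    failing.quiesced = True
    spare.quiesced = True
    log.append("quiesce")
    spare.cpu_state = dict(failing.cpu_state)
    log.append("migrate_cpu_state")

    # Step 3: unquiesce both nodes so execution can resume.
    failing.quiesced = False
    spare.quiesced = False
    log.append("unquiesce")

    # Step 4: move memory contents after unquiescing. A real system would do
    # this while keeping caches coherent; this sketch simply moves entries.
    for addr in list(failing.memory):
        spare.memory[addr] = failing.memory.pop(addr)
    log.append("migrate_data")


failing = Node("node0", cpu_state={"cpu0": {"pc": 4096}}, memory={0: "a", 8: "b"})
spare = Node("node1")
steps = []
migrate_node(failing, spare, steps)
print(steps)
```

The point of the ordering is that only the processor-state transfer happens while the nodes are quiesced; the bulk data movement is deferred until after unquiescing, which is what the abstract describes as migrating data while the system remains in operation.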
-