Invention Grant
- Patent Title: Method and apparatus for providing distributed checkpointing
- Application No.: US15207943
- Application Date: 2016-07-12
- Publication No.: US10073746B2
- Publication Date: 2018-09-11
- Inventor: Sergey Blagodurov, Taniya Siddiqua, Vilas Sridharan
- Applicant: Advanced Micro Devices, Inc.
- Applicant Address: US CA Santa Clara
- Assignee: Advanced Micro Devices, Inc.
- Current Assignee: Advanced Micro Devices, Inc.
- Current Assignee Address: US CA Santa Clara
- Agency: Faegre Baker Daniels LLP
- Main IPC: G06F11/00
- IPC: G06F11/00; G06F11/14; G06F3/06

Abstract:
Methods and apparatus presented herein provide distributed checkpointing in a multi-node system, such as a network of servers in a data center. When checkpointing of application state data is needed in a node, the methods and apparatus determine whether checkpoint memory space is available in the node for checkpointing the application state data. If not enough checkpoint memory space is available in the node, the methods and apparatus request and find additional checkpoint memory space from other nodes in the system. In this manner, the methods and apparatus can checkpoint the application state data into available checkpoint memory spaces distributed among a plurality of nodes. This allows for high bandwidth and low latency checkpointing operations in the multi-node system.
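The placement logic described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Node` class, the `plan_checkpoint` and `commit_plan` functions, the byte-count model of checkpoint memory, and the "local node first, then peers in list order" policy are all assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node with some free checkpoint memory, in bytes."""
    name: str
    free_checkpoint_bytes: int


def plan_checkpoint(local: Node, peers: list[Node], state_bytes: int) -> dict[str, int]:
    """Decide where to place state_bytes of application state data.

    The local node is tried first; any remainder is spread over peers
    in the order given. Nothing is reserved here, so a failed plan
    leaves every node unchanged.
    """
    plan: dict[str, int] = {}
    remaining = state_bytes
    for node in [local, *peers]:
        if remaining == 0:
            break
        share = min(remaining, node.free_checkpoint_bytes)
        if share > 0:
            plan[node.name] = share
            remaining -= share
    if remaining > 0:
        raise MemoryError(f"{remaining} bytes could not be placed on any node")
    return plan


def commit_plan(plan: dict[str, int], nodes_by_name: dict[str, Node]) -> None:
    """Deduct the planned shares from each node's free checkpoint memory."""
    for name, share in plan.items():
        nodes_by_name[name].free_checkpoint_bytes -= share


if __name__ == "__main__":
    a, b, c = Node("a", 4), Node("b", 3), Node("c", 10)
    plan = plan_checkpoint(a, [b, c], state_bytes=9)
    commit_plan(plan, {n.name: n for n in (a, b, c)})
    print(plan)
```

Separating planning from committing is a design choice of this sketch: it avoids partially reserved space when the system as a whole lacks capacity. A real system would also need to handle concurrent requests and node failures, which the sketch ignores.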
Public/Granted literature
- US20180018242A1 Method and apparatus for providing distributed checkpointing (Public/Granted day: 2018-01-18)