摘要:
Incremental single and multiprocess checkpointing and restoration is described, which is transparent in that the application program need not be modified, re-compiled, or re-linked to gain the benefits of the invention. The processes subject to checkpointing can be either single or multi-threaded. The method includes incremental page-boundary checkpointing, as well as storage checkpointing of data files associated with applications to ensure correct restoration without the need to restore files for other application programs. Incremental and full checkpoints are asynchronously merged to ensure proper operation while reducing checkpointing delay. By way of example a user-level programming library is described for loading into the address space of the application in conjunction with a loadable kernel module (LKM) or device driver used to capture and restore process state on behalf of the application. These techniques are particularly well suited for use with high-availability (HA) protection programming.
摘要:
A mirror destination storage server receives mirror update data streams from several mirror source storage servers. Data received from each mirror is cached and periodic checkpoints are queued, but the data is not committed to long-term storage at the mirror destination storage server immediately. Instead, the data remains in cache memory until a trigger event causes the cache to be flushed to a mass storage device. The trigger event is asynchronous with respect to packets of at least one of the data streams. In one embodiment, the trigger event is asynchronous with respect to packets of all of the data streams.
摘要:
Incremental single and multiprocess checkpointing and restoration is described, which is transparent in that the application program need not be modified, re-compiled, or re-linked to gain the benefits of the invention. The processes subject to checkpointing can be either single or multi-threaded. The method includes incremental page-boundary checkpointing, as well as storage checkpointing of data files associated with applications to ensure correct restoration without the need to restore files for other application programs. Incremental and full checkpoints are asynchronously merged to ensure proper operation while reducing checkpointing delay. By way of example a user-level programming library is described for loading into the address space of the application in conjunction with a loadable kernel module (LKM) or device driver used to capture and restore process state on behalf of the application. These techniques are particularly well suited for use with high-availability (HA) protection programming.