Abstract:
Illustrated is a system and method for executing a checkpoint scheme as part of processing a workload using an application. The system and method identify a checkpoint event that requires an additional checkpoint scheme, retrieve checkpoint data associated with that event, and build a checkpoint model based upon the checkpoint data. Based upon the checkpoint model, the system and method then identify the additional checkpoint scheme, which is to be executed as part of processing the workload using the application.
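The flow described in this abstract (event, checkpoint data, model, additional scheme) can be illustrated with a minimal Python sketch. The abstract does not say what the checkpoint data contains, how the model is built, or how a scheme is selected, so everything below is an assumption made for illustration: the `CheckpointRecord` layout, the "checkpoint took too long" trigger, the mean-cost model, and the scheme names `full` and `incremental` are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class CheckpointRecord:
    """One observed checkpoint (hypothetical record layout)."""
    scheme: str       # hypothetical scheme name, e.g. "full" or "incremental"
    seconds: float    # time spent writing the checkpoint


def is_checkpoint_event(record, max_seconds):
    """Assumed trigger: a checkpoint that took longer than the budget."""
    return record.seconds > max_seconds


def build_model(records):
    """Toy checkpoint model: mean checkpoint time per scheme."""
    by_scheme = {}
    for rec in records:
        by_scheme.setdefault(rec.scheme, []).append(rec.seconds)
    return {scheme: mean(times) for scheme, times in by_scheme.items()}


def choose_additional_scheme(model, current_scheme):
    """Pick the cheapest modeled scheme other than the one already running."""
    candidates = {s: cost for s, cost in model.items() if s != current_scheme}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical checkpoint data gathered while processing a workload.
history = [
    CheckpointRecord("full", 12.0),
    CheckpointRecord("full", 14.0),
    CheckpointRecord("incremental", 3.0),
    CheckpointRecord("incremental", 5.0),
]
latest = CheckpointRecord("full", 14.0)

additional = None
if is_checkpoint_event(latest, max_seconds=10.0):
    model = build_model(history)
    additional = choose_additional_scheme(model, current_scheme=latest.scheme)
```

Here the slow `full` checkpoint counts as the checkpoint event, and the model built from the history points to `incremental` as the additional scheme. A real system would likely use a richer model than a per-scheme mean.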
Abstract:
A method and apparatus for transparent failover of a filesystem within a computer cluster is provided. For failover protection, a filesystem is physically connected to an active server node and a standby server node. A cluster file system provides distributed access to the filesystem throughout the computer cluster. The cluster file system monitors the progress of each operation performed on the failover protected filesystem. If the active server node should fail during an operation, all processes performing operations on the failover protected filesystem are caused to sleep. The filesystem is then relocated to the standby server node. The cluster file system then awakens each sleeping process and retries each pending operation.
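The sleep, relocate, awaken, and retry sequence can be sketched in Python as follows. This is a single-process simulation under stated assumptions, not the cluster file system's actual mechanism: the class `FailoverFilesystem`, the `NodeFailed` exception, the `failed_nodes` set used to simulate a crash, and the node names are all hypothetical, and the standby node is assumed to be healthy.

```python
import threading


class NodeFailed(Exception):
    """Simulated failure of the node serving an operation."""
    def __init__(self, node):
        super().__init__(node)
        self.node = node


class FailoverFilesystem:
    """Toy model of a filesystem attached to an active and a standby node."""

    def __init__(self, active, standby):
        self.server = active
        self.standby = standby
        self.failed_nodes = set()            # test hook: nodes treated as crashed
        self._lock = threading.Lock()
        self._available = threading.Event()
        self._available.set()                # filesystem starts out reachable

    def _serve(self, op):
        node = self.server
        if node in self.failed_nodes:
            raise NodeFailed(node)
        return f"{op} served by {node}"

    def _failover(self, failed_node):
        with self._lock:
            if self.server != failed_node:
                return                       # another caller already relocated
            if self.standby in self.failed_nodes:
                raise RuntimeError("no healthy standby node")
            self._available.clear()          # callers now sleep in perform()
            self.server = self.standby       # relocate the filesystem
            self._available.set()            # awaken sleepers so they retry

    def perform(self, op):
        """Run op, sleeping during relocation and retrying after a failure."""
        while True:
            self._available.wait()
            try:
                return self._serve(op)
            except NodeFailed as exc:
                self._failover(exc.node)


fs = FailoverFilesystem(active="node-a", standby="node-b")
before = fs.perform("read")
fs.failed_nodes.add("node-a")                # simulate the active node failing
after = fs.perform("write")                  # retried transparently on node-b
```

The caller of `perform` never sees the failure, which is the sense in which the failover is transparent. The `threading.Event` stands in for putting processes to sleep and waking them, and the lock ensures that only the first caller to notice the failure performs the relocation.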