Abstract:
Embodiments comprise a plurality of computing devices that dynamically intercept and process application I/O errors. Various embodiments comprise two or more computing devices, such as two or more servers, each having access to a shared data storage system. An application may be executing on a first computing device and performing an I/O operation when an I/O error occurs. The first computing device may intercept the I/O error rather than pass it back to the application, preventing the error from affecting the application. The first computing device may then complete the I/O operation, along with any other pending I/O operations not yet written to disk, via an alternate path; perform a checkpoint operation to capture the state of the set of processes associated with the application; and transfer the checkpoint image to a second computing device. The second computing device may resume operation of the application from the checkpoint image.
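
The sequence described in this abstract (intercept the error, finish pending writes over an alternate path, checkpoint, hand off) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `StoragePath`, `Node`, `app_write`, and `needs_relocation` are hypothetical, shared storage is modeled as a plain dictionary, and the checkpoint image is a pickled dictionary rather than a real process image.

```python
import pickle


class PathError(OSError):
    """Hypothetical error raised when a storage path cannot complete a write."""


class StoragePath:
    """Hypothetical path to the shared storage system (modeled as a dict)."""

    def __init__(self, shared_storage, healthy=True):
        self.shared_storage = shared_storage
        self.healthy = healthy

    def write(self, key, data):
        if not self.healthy:
            raise PathError(f"write to {key!r} failed")
        self.shared_storage[key] = data


class Node:
    """Hypothetical computing device hosting the application."""

    def __init__(self, primary, alternate):
        self.primary = primary
        self.alternate = alternate
        self.pending = []            # writes not yet on shared storage
        self.app_state = {}          # stand-in for the application's process state
        self.needs_relocation = False

    def app_write(self, key, data):
        """Write on behalf of the application; a PathError never reaches the caller."""
        self.pending.append((key, data))
        try:
            self.primary.write(key, data)
            self.pending.remove((key, data))
        except PathError:
            # Intercept the error instead of passing it back to the application,
            # then complete every pending write over the alternate path.
            self.needs_relocation = True
            self.flush_via_alternate()

    def flush_via_alternate(self):
        while self.pending:
            key, data = self.pending.pop(0)
            self.alternate.write(key, data)

    def checkpoint(self):
        """Capture the application state as a transferable image (bytes)."""
        return pickle.dumps(self.app_state)

    def resume(self, image):
        """Resume the application from a checkpoint image."""
        self.app_state = pickle.loads(image)


shared = {}
first = Node(StoragePath(shared, healthy=False), StoragePath(shared))
first.app_state["counter"] = 7
first.app_write("block-1", b"payload")   # primary path fails, alternate succeeds

second = Node(StoragePath(shared), StoragePath(shared))
if first.needs_relocation:
    second.resume(first.checkpoint())    # transfer modeled as passing bytes
```

In this sketch the application-facing call `app_write` returns normally even though the primary path failed, which mirrors the idea that the error is kept from affecting the application. A real system would checkpoint operating-system process state, not a Python dictionary.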
Abstract:
Embodiments that generate checkpoint images of an application for use as a warm standby are contemplated. The embodiments may monitor accesses of external references by threads. An external reference may comprise a connection to, or use of the services of, an entity external to the set of processes that constitute the application, to which a process of the application attempts to connect by means of a socket or inter-process communication (IPC). Various embodiments comprise two or more computing devices, such as two or more servers. One of the computing devices may generate a checkpoint image of an application at a suitable point during initialization, when the state of the application does not yet depend on interactions with external references. A second computing device may preload the checkpoint image for the application and activate it when needed, following the specific resource management rules of the distributed subsystem.
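
The warm-standby idea in this abstract can be sketched as follows: take the checkpoint just before the first external reference is accessed, so the image does not depend on any external interaction, and let a standby device preload it for later activation. The names `InitMonitor`, `before_external_access`, and `Standby` are hypothetical, the hook stands in for whatever mechanism actually observes socket or IPC connection attempts, and the image is a pickled dictionary rather than a real process image.

```python
import pickle


class InitMonitor:
    """Hypothetical monitor that watches for external references during initialization."""

    def __init__(self, state):
        self.state = state   # stand-in for the application's process state
        self.image = None    # checkpoint image, taken at most once

    def before_external_access(self, target):
        """Hypothetical hook invoked just before a socket or IPC connection attempt."""
        if self.image is None:
            # State does not yet depend on any external reference: checkpoint now.
            self.image = pickle.dumps(self.state)


class Standby:
    """Hypothetical second computing device holding a preloaded image."""

    def __init__(self):
        self.preloaded = None
        self.active_state = None

    def preload(self, image):
        self.preloaded = image

    def activate(self):
        if self.preloaded is None:
            raise RuntimeError("no checkpoint image has been preloaded")
        self.active_state = pickle.loads(self.preloaded)
        return self.active_state


state = {"config_loaded": True}
monitor = InitMonitor(state)

monitor.before_external_access("db.example:5432")  # first external reference
state["connected"] = True                          # state now depends on it
monitor.before_external_access("cache.example")    # later accesses take no new image

standby = Standby()
standby.preload(monitor.image)
restored = standby.activate()
```

Because the image is taken before `state["connected"]` is set, the activated standby starts from the pre-connection state and would establish its own external connections after activation. The resource management rules mentioned in the abstract are not modeled here.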