Abstract:
An embodiment of the invention is a method for proactive failover using user-defined rules. An event log of a first server node is monitored for user-specified application events. One of the user-specified application events, corresponding to an impending failure in an application running on the first server node, is detected. In automatic response to the detected impending failure, a proactive failover process is executed to transfer the application to a second server node for continued execution, the second server node being connected to the first server node in a cluster.
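The monitoring loop described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Event` and `Rule` types, the substring-matching rule format, and the `failover` callback are all hypothetical names introduced here, and the event log is modeled as a plain in-memory sequence.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Event:
    node: str      # server node that logged the event
    source: str    # application that raised the event
    message: str   # free-text event description


@dataclass
class Rule:
    # Hypothetical user-defined rule: an event from `source` whose message
    # contains `pattern` is treated as a sign of impending failure.
    source: str
    pattern: str

    def matches(self, event: Event) -> bool:
        return event.source == self.source and self.pattern in event.message


def monitor(events: Iterable[Event], rules: List[Rule], first_node: str,
            second_node: str,
            failover: Callable[[str, str, str], None]) -> Optional[Event]:
    """Scan the first node's events; on the first rule match, fail over."""
    for event in events:
        if event.node != first_node:
            continue  # only the first node's event log is monitored
        if any(rule.matches(event) for rule in rules):
            # Transfer the application to the second node in the cluster.
            failover(event.source, first_node, second_node)
            return event
    return None


# Usage: record the requested failover instead of moving a real application.
moves = []
rules = [Rule(source="db", pattern="low memory")]
log = [Event("node1", "db", "started"),
       Event("node1", "db", "low memory warning")]
hit = monitor(log, rules, "node1", "node2",
              lambda app, src, dst: moves.append((app, src, dst)))
```

In a real cluster the callback would invoke the cluster manager's own move operation; the sketch stops at the first match because the application has left the first node by then.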
Abstract:
The method of the present invention is useful in a computer system including at least two server nodes, each of which can execute clustered server software. The method monitors failure situations to reduce downtime. It includes the step of detecting an event causing one of the failure situations and then determining whether the event affects one of the server nodes. If the event does affect one of the server nodes, the method determines whether the event exceeds a threshold value. If the threshold is exceeded, the method executes a proactive failover. If the event is not specific to a cluster node but indicates an impending or actual failure of the cluster software, the method identifies and initiates an appropriate action to fix the condition or provide a workaround (if available) that preempts an impending failure of the cluster system or enables a restart of the failed cluster software.
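The decision flow in this abstract can be sketched as a small Python function. This is an illustrative reading rather than the claimed method: the `Action` enum, the `decide` signature, and the numeric `value` compared against the threshold are hypothetical. The abstract does not say what happens when a node-specific event stays at or below the threshold, so the sketch assumes no action is taken.

```python
from enum import Enum
from typing import Optional, Set


class Action(Enum):
    IGNORE = "ignore"                  # assumed default; not stated in the abstract
    FAILOVER = "proactive_failover"
    REMEDIATE = "fix_or_workaround"


def decide(node: Optional[str], value: float, threshold: float,
           cluster_nodes: Set[str],
           cluster_software_fault: bool = False) -> Action:
    """Choose a response to one detected event."""
    if node in cluster_nodes:
        # Node-specific event: fail over only if the threshold is exceeded.
        return Action.FAILOVER if value > threshold else Action.IGNORE
    if cluster_software_fault:
        # Not tied to a node, but the cluster software itself has failed
        # or is about to: apply a fix or workaround, or restart it.
        return Action.REMEDIATE
    return Action.IGNORE


# Usage with a two-node cluster and a hypothetical 90.0 threshold.
nodes = {"node1", "node2"}
first = decide("node1", 95.0, 90.0, nodes)
second = decide(None, 0.0, 90.0, nodes, cluster_software_fault=True)
```

Keeping the decision separate from the action makes the branching easy to test without touching a live cluster.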