Abstract:
A method, apparatus, and computer instructions are provided for a common cluster model for configuring, managing, and operating different clustering technologies in a data center. The common cluster model supports peer cluster domains and management server cluster domains. Each cluster domain may have one or more cluster nodes. For each cluster domain, one or more cluster resources may be defined. These resources may depend on one another and may be grouped into a resource group. A set of logical device operations for cluster domains and cluster resources is provided to configure, manage, and operate cluster domains and their associated resources.
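The entities named in this abstract can be pictured with a small data-structure sketch. It is illustrative only: the class names, fields, and the `start_order` helper are hypothetical and are not taken from the patent, which does not specify a concrete schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class DomainType(Enum):
    # The two kinds of cluster domain the abstract mentions.
    PEER = "peer"
    MANAGEMENT_SERVER = "management_server"


@dataclass
class ClusterResource:
    name: str
    depends_on: list = field(default_factory=list)  # names of other resources


@dataclass
class ResourceGroup:
    name: str
    resources: dict = field(default_factory=dict)

    def add(self, resource):
        self.resources[resource.name] = resource

    def start_order(self):
        """Return resource names so that each one follows its dependencies."""
        order, visiting = [], set()

        def visit(name):
            if name in order:
                return
            if name in visiting:
                raise ValueError(f"dependency cycle at {name}")
            visiting.add(name)
            for dep in self.resources[name].depends_on:
                visit(dep)
            visiting.discard(name)
            order.append(name)

        for name in self.resources:
            visit(name)
        return order


@dataclass
class ClusterDomain:
    name: str
    domain_type: DomainType
    nodes: list = field(default_factory=list)
    groups: dict = field(default_factory=dict)

    def add_node(self, node):
        self.nodes.append(node)

    def add_group(self, group):
        self.groups[group.name] = group
```

Keeping dependencies as names inside a resource group lets a single traversal derive a safe start order, which is one plausible way a "start resource group" operation could be built on such a model.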
Abstract:
A method, apparatus, and computer instructions are provided for expressing high availability (H/A) cluster demand based on probability of breach. When a failover occurs in the H/A cluster, event messages are sent to a provisioning manager server. The mechanism of embodiments of the present invention filters the event messages and translates the events into probability of breach data. The mechanism then updates the data model of the provisioning manager server and makes a recommendation to the provisioning manager server as to whether a new node should be reprovisioned. The provisioning manager server makes the decision and either reprovisions new nodes to the H/A cluster or notifies the administrator of a detected poisoning problem.
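The filter, translate, update, and recommend sequence can be sketched as follows. The abstract does not say how events map to a probability of breach, so this sketch assumes, purely for illustration, that the probability is the fraction of cluster nodes reported as failed. The event type names, the `ClusterState` fields, and the threshold are likewise hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical event types treated as relevant to failover.
RELEVANT_EVENTS = {"NODE_FAILOVER", "NODE_DOWN"}


@dataclass
class ClusterState:
    """Stand-in for the provisioning manager's data model of one H/A cluster."""
    total_nodes: int
    failed_nodes: set = field(default_factory=set)
    probability_of_breach: float = 0.0


def filter_events(events):
    # Keep only the messages that signal a failover-related condition.
    return [e for e in events if e.get("type") in RELEVANT_EVENTS]


def update_model(state, events):
    # ASSUMPTION: probability of breach = failed nodes / total nodes.
    for event in filter_events(events):
        state.failed_nodes.add(event["node"])
    state.probability_of_breach = min(
        1.0, len(state.failed_nodes) / state.total_nodes
    )
    return state


def recommend_reprovision(state, threshold=0.5):
    # Only a recommendation; the provisioning manager makes the decision.
    return state.probability_of_breach >= threshold
```

Separating the recommendation from the decision mirrors the abstract, in which the provisioning manager server either reprovisions nodes or notifies the administrator.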
Abstract:
One aspect of the present invention provides a workflow model to effectively respond to outage events within an IT infrastructure. This workflow model enables a combination of manual and automated processing to effectively deploy a flexible, plannable, and testable recovery from outages and problems encountered within IT infrastructure settings. In one embodiment, a shared processing context is created to accompany the operations of the workflow, thereby collecting, in one location, useful data related to events and status information during the outage and the outage response. Within the workflow, analysis of the outage event is performed, an appropriate recovery plan is selected, the selected recovery plan is implemented, and recovery from the outage event is completed. Data collected within the processing context can be analyzed to support post-mortem analysis and continuous service improvements. Accordingly, the improvements can be implemented within the IT infrastructure directly or within the appropriate recovery plan.
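The four workflow stages and the shared processing context can be sketched as below. This is a minimal illustration, not the patented design: `ProcessingContext`, `run_outage_workflow`, and the callables passed in (`diagnose`, `execute_step`) are hypothetical names, and a real step could be either automated or a prompt for a human operator.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingContext:
    """Single place where the workflow records events and status."""
    outage: str
    log: list = field(default_factory=list)

    def record(self, stage, detail):
        self.log.append((stage, detail))


def run_outage_workflow(outage, diagnose, plans, execute_step):
    ctx = ProcessingContext(outage)

    # 1. Analyze the outage event.
    cause = diagnose(outage)
    ctx.record("analyze", cause)

    # 2. Select a recovery plan for the diagnosed cause.
    plan = plans.get(cause)
    if plan is None:
        ctx.record("select", "no plan found; escalate to operator")
        return ctx
    ctx.record("select", cause)

    # 3. Implement the plan step by step.
    for step in plan:
        succeeded = execute_step(step)
        ctx.record("implement", (step, succeeded))
        if not succeeded:
            ctx.record("complete", "failed")
            return ctx

    # 4. Complete the recovery.
    ctx.record("complete", "recovered")
    return ctx
```

Because every stage writes to the same context, the returned `log` can be inspected afterwards for post-mortem analysis, for example to find which plan steps fail most often.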