Abstract:
A method and system for increasing server cluster availability by requiring at a minimum only one node and a quorum replica set of replica members to form and operate a cluster. Replica members, independent from the nodes, maintain cluster operational data. A cluster operates when one node possesses a majority of replica members, which ensures that any new or surviving cluster includes consistent cluster operational data via at least one replica member from the immediately prior cluster. Arbitration provides exclusive ownership by one node of the replica members, including at cluster formation, and when the owning node fails. Arbitration uses a fast mutual exclusion algorithm and a reservation mechanism to challenge for and defend the exclusive reservation of each member. A quorum replica set algorithm brings members online and offline with data consistency, including updating unreconciled replica members, and ensures consistent read and update operations.
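The majority rule described above can be illustrated with a minimal sketch. The function name `has_quorum` and the member names are hypothetical and do not come from the abstract; the sketch only shows why owning a strict majority guarantees overlap with any earlier majority.

```python
def has_quorum(owned_members: set, replica_set: set) -> bool:
    """Return True when a node owns a strict majority of the replica set.

    Hypothetical helper: members outside the configured set are ignored.
    """
    owned = owned_members & replica_set
    return len(owned) > len(replica_set) // 2


replica_set = {"r1", "r2", "r3"}
prior_cluster = {"r1", "r2"}   # majority held by the previous owner
new_cluster = {"r2", "r3"}     # majority held by the surviving node

print(has_quorum(prior_cluster, replica_set))
print(has_quorum(new_cluster, replica_set))
# Any two majorities of the same set share at least one member,
# which is how the new cluster sees the prior cluster's data.
print(prior_cluster & new_cluster)
```

Because two strict majorities of one set always intersect, at least one replica member carries the operational data forward from the immediately prior cluster.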
Abstract:
A method and system wherein following a partitioning of a server cluster, operational subgroups arbitrate for possession of a quorum resource that determines cluster representation, wherein the arbitration is biased by a relative weight of the subgroup. The weight may be relative to the original cluster weight, or submitted as a bid that is relative to other possible subgroup weights. The biasing gives subgroups that are better capable of representing the cluster an arbitration advantage over lesser subgroups. The biasing weight of each subgroup may be determined by node count and/or by a calculation of the subgroup's resources. The arbitration may be delayed based on the relative weight, or alternatively, the arbitration may comprise a bidding process in which a subgroup's bid is based on the subgroup's relative weight.
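One way to picture the delay-based variant is a subgroup that waits longer before arbitrating the smaller its share of the original cluster weight. The linear formula, the function name `arbitration_delay`, and the `max_delay_s` parameter below are assumptions for illustration; the abstract does not specify how the delay is computed.

```python
def arbitration_delay(subgroup_weight: float,
                      cluster_weight: float,
                      max_delay_s: float = 10.0) -> float:
    """Return how long a subgroup waits before arbitrating (assumed linear rule).

    A subgroup holding the full original weight waits 0 seconds; a
    subgroup with almost no weight waits close to max_delay_s.
    """
    if cluster_weight <= 0:
        raise ValueError("cluster_weight must be positive")
    if not 0 <= subgroup_weight <= cluster_weight:
        raise ValueError("subgroup_weight must lie in [0, cluster_weight]")
    relative = subgroup_weight / cluster_weight
    return max_delay_s * (1.0 - relative)


# A five-node cluster splits into subgroups of three and two nodes,
# using node count as the weight.
print(arbitration_delay(3, 5))
print(arbitration_delay(2, 5))
```

The heavier subgroup gets the shorter delay, so it reaches the quorum resource first unless it has failed.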
Abstract:
A method and system for increasing server cluster availability by requiring at a minimum only one node and a quorum replica set of replica members to form and operate a cluster. Replica members maintain cluster operational data. A cluster operates when one node possesses a majority of replica members, which ensures that any new or surviving cluster includes consistent cluster operational data via at least one replica member from the immediately prior cluster. Arbitration provides exclusive ownership by one node of the replica members, including at cluster formation, and when the owning node fails. Arbitration uses a fast mutual exclusion algorithm and a reservation mechanism to challenge for and defend the exclusive reservation of each member. A quorum replica set algorithm brings members online and offline with data consistency, including updating unreconciled replica members, and ensures consistent read and update operations.
Abstract:
A method and system for communicating modification information to servers in a server cluster. Local changes, such as modifications to a resource requested at one node, are associated into a single transaction. A master node, such as the node that owns the set of resources corresponding to the modifications in the transaction, requests permission from a locker node to replicate the transaction. When permission to replicate the transaction is received from the locker node, the master node replicates the transaction by requesting each node in the cluster, one node at a time, to commit the transaction. Any node that does not commit the transaction is removed from the cluster, ensuring consistency of the cluster. Failure conditions of any node or nodes are also handled in a manner that ensures consistency.
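The replication flow above can be sketched as a small in-process simulation. The classes `Node` and `Locker` and the function `replicate` are hypothetical names; real nodes would communicate over a network, and the handling of master or locker failure is not modeled here.

```python
class Node:
    """Hypothetical cluster member that either commits or fails to commit."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.log = []

    def commit(self, txn: dict) -> bool:
        if not self.healthy:
            return False
        self.log.append(txn)
        return True


class Locker:
    """Hypothetical locker node granting one replication permit at a time."""

    def __init__(self):
        self.holder = None

    def request(self, master: str) -> bool:
        if self.holder is None:
            self.holder = master
            return True
        return False

    def release(self, master: str) -> None:
        if self.holder == master:
            self.holder = None


def replicate(master: str, locker: Locker, members: list, txn: dict) -> list:
    """Commit txn on each member in turn; return the surviving membership."""
    if not locker.request(master):
        raise RuntimeError("locker node denied permission to replicate")
    try:
        survivors = []
        for node in members:          # one node at a time
            if node.commit(txn):
                survivors.append(node)
            # a node that does not commit is dropped from the cluster
        return survivors
    finally:
        locker.release(master)


members = [Node("a"), Node("b", healthy=False), Node("c")]
survivors = replicate("a", Locker(), members, {"resource": "disk1", "state": "online"})
print([n.name for n in survivors])
```

Dropping the non-committing node keeps every remaining member's log identical, which is the consistency property the abstract describes.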
Abstract:
A method and system for distributing various types of cluster data among various storage devices of a server cluster. Cluster core boot data that is needed to get the cluster up and running is stored on a quorum storage mechanism, separate from cluster configuration data which is stored on lower cost and/or higher performance storage. The quorum storage may be implemented via a quorum of nodes, a single quorum disk or a quorum of replica members. The state of the cluster configuration data, as well as the state of other cluster data, may be stored on the quorum storage, thereby assuring the integrity of the data while providing increased reliability through the use of mirror sets of storage elements or the like for storing that data. Significant flexibility in how a cluster may be configured is achieved, along with improved cluster performance and scalability.
Abstract:
A method and system for selecting a set of systems (nodes) for a server cluster from at least two non-communicating sets of systems. A persistent storage device with cluster configuration information therein is provided as a quorum resource. Using an arbitration process, only one system exclusively reserves the quorum resource. The set with the system therein having the exclusive reservation of the quorum device is selected as the cluster. The arbitration process provides a challenge-defense protocol whereby a system can obtain the reservation of the quorum device when the system that has the reservation fails.
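A heavily simplified, single-process simulation can make the challenge-defense idea concrete. The class `QuorumDevice`, the function `challenge`, and the step ordering are assumptions for illustration only; the abstract does not describe the device commands or timing involved.

```python
class QuorumDevice:
    """Hypothetical quorum device that holds at most one reservation."""

    def __init__(self, owner=None):
        self.owner = owner

    def reserve(self, system: str) -> bool:
        # Succeeds only if the device is free or already reserved by this system.
        if self.owner in (None, system):
            self.owner = system
            return True
        return False

    def break_reservation(self) -> None:
        self.owner = None


def challenge(device: QuorumDevice, challenger: str,
              defender: str, defender_alive: bool) -> str:
    """Run one challenge round and return the resulting owner."""
    device.break_reservation()        # challenger clears the reservation
    # During the challenger's wait interval a live defender re-reserves.
    if defender_alive:
        device.reserve(defender)
    device.reserve(challenger)        # fails if the defense succeeded
    return device.owner


print(challenge(QuorumDevice("A"), "B", "A", defender_alive=True))
print(challenge(QuorumDevice("A"), "B", "A", defender_alive=False))
```

In this sketch a live owner always wins the round, while a failed owner cannot defend, so the challenger takes over the reservation.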
Abstract:
A method and system for providing remote access and control of devices such as disks, tape drives and modems across a network. A client driver intercepts I/O requests that are destined for a device which an application essentially considers a local device, such as for purposes of disk mirroring. The client driver queues and dequeues the I/O request, marshals it with header information and data, and sends it as a message to the server over one of possibly multiple connections to the server. A server driver unmarshals the message, places it in a preallocated buffer as designated by the client, and generates an I/O request therefrom directed to the server device. The server responds with a completion status. The client side manages the server buffers, and the client classifies and partitions large requests into one or more segments that fit the server buffers. Sequential processing also may be ensured. The client also handles cancel operations on the remote device, and the client may also load balance across the multiple paths, by selecting a connection based on criteria including pending message size and dynamic performance measurements of the connections.
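Two of the client-side steps above, partitioning a large request and choosing a connection, lend themselves to a short sketch. The function names, the `(offset, size)` segment format, and the "estimated time to drain" selection rule are assumptions; the abstract names the criteria but not how they are combined.

```python
def split_request(offset: int, length: int, buffer_size: int) -> list:
    """Partition one request into (offset, size) segments that fit a server buffer."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    segments = []
    end = offset + length
    while offset < end:
        size = min(buffer_size, end - offset)
        segments.append((offset, size))
        offset += size
    return segments


def pick_connection(connections: dict) -> str:
    """Pick the connection with the smallest estimated drain time.

    connections maps a name to (pending_bytes, measured_bytes_per_second);
    the throughput is assumed to be positive.
    """
    return min(connections,
               key=lambda name: connections[name][0] / connections[name][1])


print(split_request(0, 10, 4))
print(pick_connection({"path_a": (1000, 100.0), "path_b": (4000, 1000.0)}))
```

Here `path_b` has more bytes pending but is measured as faster, so its queue is expected to drain sooner and it is selected.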
Abstract:
A method and system for increasing the availability of a server cluster while reducing its cost by requiring at a minimum only one node and a quorum replica set of storage devices (replica members) to form and continue operating as a cluster. A plurality of replica members maintain the cluster operational data and are independent from any given node. A cluster may be formed and continue to operate as long as one server node possesses a quorum (majority) of the replica members. This ensures that a new or surviving cluster has at least one replica member that belonged to the immediately prior cluster and is thus correct with respect to the cluster operational data. Update sequence numbers and/or timestamps are used to determine the most updated replica member from among those in the quorum for reconciling the other replica members.
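The reconciliation step can be sketched as follows, using only an update sequence number (timestamps are omitted). The `Replica` dataclass and the `reconcile` function are hypothetical names, and copying the whole data dictionary is an assumed stand-in for whatever update mechanism is actually used.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical replica member holding cluster operational data."""
    name: str
    sequence: int
    data: dict


def reconcile(quorum: list) -> Replica:
    """Bring every replica in the quorum up to the most updated member."""
    latest = max(quorum, key=lambda r: r.sequence)
    for replica in quorum:
        if replica is not latest:
            replica.data = dict(latest.data)   # copy, so replicas stay independent
            replica.sequence = latest.sequence
    return latest


quorum = [
    Replica("r1", 7, {"owner": "node1"}),
    Replica("r2", 9, {"owner": "node2"}),
]
latest = reconcile(quorum)
print(latest.name, [r.sequence for r in quorum])
```

Since the quorum always contains at least one member from the immediately prior cluster, the highest sequence number in the quorum identifies data that is current.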
Abstract:
A method and system in a server cluster for monitoring and controlling a resource object, such as a physical device or application. A cluster service connects to a resource monitoring component to control and monitor the health of one or more resource objects. The resource component includes a plurality of methods, common to all such resource components, for calling by the resource monitor to control and monitor operation of the resource object therethrough. The common methods enable the cluster server to treat all resources similarly without regard to the type of resource.
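The idea of a common method set can be sketched with an abstract base class. The method names `online`, `offline`, and `is_alive`, and the two example resource types, are assumptions for illustration; the abstract does not list the actual methods.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Hypothetical common interface every resource component implements."""

    def __init__(self, name: str):
        self.name = name
        self.running = False

    @abstractmethod
    def online(self) -> None: ...

    @abstractmethod
    def offline(self) -> None: ...

    @abstractmethod
    def is_alive(self) -> bool: ...


class DiskResource(Resource):
    def online(self) -> None:
        self.running = True      # stand-in for mounting a physical device

    def offline(self) -> None:
        self.running = False

    def is_alive(self) -> bool:
        return self.running


class AppResource(Resource):
    def online(self) -> None:
        self.running = True      # stand-in for starting an application

    def offline(self) -> None:
        self.running = False

    def is_alive(self) -> bool:
        return self.running


def poll(resources: list) -> dict:
    """Monitor every resource the same way, regardless of its type."""
    return {r.name: r.is_alive() for r in resources}


disk, app = DiskResource("disk1"), AppResource("web")
disk.online()
print(poll([disk, app]))
```

The `poll` function never inspects the concrete type, which mirrors how a shared method set lets the monitoring side treat all resources alike.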