Abstract:
Entities within a cluster are uniquely identified with a node ID and an engine ID. The node ID uniquely identifies a node within a cluster of nodes and the engine ID uniquely identifies one of several engines included in the node. Entities may be further identified with a cluster ID, an engine type ID, and/or a virtual server ID. At least some of these IDs may be included in communications received from clients and used to route the communications to the cluster entity identified by the included IDs.
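
The routing described above might look roughly like the following sketch. It assumes a message is a plain dictionary carrying `node_id`, `engine_id`, and `payload` keys; the names `EntityAddress` and `ClusterRouter` are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class EntityAddress:
    """Hypothetical address of a cluster entity."""
    node_id: int    # identifies a node within the cluster
    engine_id: int  # identifies one engine within that node


class ClusterRouter:
    """Routes client messages to the engine named by the IDs they carry."""

    def __init__(self) -> None:
        self._engines: Dict[EntityAddress, Callable[[str], str]] = {}

    def register(self, address: EntityAddress, handler: Callable[[str], str]) -> None:
        self._engines[address] = handler

    def route(self, message: dict) -> str:
        # The IDs are read from the message itself (assumed dictionary layout).
        address = EntityAddress(message["node_id"], message["engine_id"])
        handler = self._engines.get(address)
        if handler is None:
            raise LookupError(f"no engine registered for {address}")
        return handler(message["payload"])
```

A cluster ID, engine type ID, or virtual server ID could be added as further fields of the address in the same way.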
Abstract:
A system may include a client and a distributed data manager coupled to the client. The distributed data manager may include a data store storing a data object that includes several sub-elements. The client is configured to update a portion of the data object by sending a message to the distributed data manager. The message specifies one of the sub-elements of the data object to be updated and includes a new value of that sub-element but does not include a new value of the entire data object. The distributed data manager is configured to perform updates to the data object in the data store dependent on which of the sub-elements of the data object are specified by the client.
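
A minimal sketch of such a partial update, assuming the data object is a dictionary of named sub-elements and the update message is a dictionary with `object_id`, `sub_element`, and `new_value` keys. These names and the in-memory store are illustrative assumptions only.

```python
from typing import Any, Dict


class DistributedDataManager:
    """Toy stand-in for a data manager with an in-memory data store."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, Any]] = {}

    def put(self, object_id: str, value: Dict[str, Any]) -> None:
        # Store a whole data object (all of its sub-elements).
        self._store[object_id] = dict(value)

    def update_sub_element(self, message: Dict[str, Any]) -> None:
        # The message names one sub-element and carries only its new value,
        # not a new value for the entire data object.
        obj = self._store[message["object_id"]]
        name = message["sub_element"]
        if name not in obj:
            raise KeyError(f"unknown sub-element: {name}")
        obj[name] = message["new_value"]

    def get(self, object_id: str) -> Dict[str, Any]:
        return dict(self._store[object_id])
```

Only the named sub-element is written; the other sub-elements of the object are left as they were.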
Abstract:
A distributed system provides for separate management of dynamic cluster membership and distributed data. Nodes of the distributed system may include a state manager and a topology manager. The state manager handles data access from the cluster. The topology manager handles changes to the dynamic cluster topology and enables operation of the state manager by handling topology changes, such as new nodes joining the cluster and member nodes exiting the cluster. The topology manager may follow a static topology description when handling cluster topology changes. Data replication and recovery functions may be implemented, for example to provide high availability.
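
One way to picture the separation of concerns is the sketch below. It assumes the static topology description is simply a set of node names allowed to join, and that the topology manager informs the state manager through a hypothetical `on_topology_change` callback; neither detail is specified by the abstract.

```python
from typing import Any, Dict, Iterable, Set


class StateManager:
    """Handles data access; knows the membership only through notifications."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self.members: Set[str] = set()

    def on_topology_change(self, members: Iterable[str]) -> None:
        self.members = set(members)

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str) -> Any:
        return self._data[key]


class TopologyManager:
    """Handles joins and exits, checked against a static topology description."""

    def __init__(self, static_topology: Iterable[str], state: StateManager) -> None:
        self._allowed = set(static_topology)  # assumed form of the description
        self._members: Set[str] = set()
        self._state = state

    def join(self, node: str) -> bool:
        if node not in self._allowed:
            return False
        self._members.add(node)
        self._state.on_topology_change(self._members)
        return True

    def leave(self, node: str) -> None:
        self._members.discard(node)
        self._state.on_topology_change(self._members)
```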
Abstract:
A method, system, and medium are disclosed for performing failover data replication with colocation of session state data. In servicing a client request, a first session is created on a primary server. A first portion of session data comprises a state of the first session and is stored on the primary server. An identifier of the first session is stored on the primary server. One or more backup servers are selected for backup of the first portion of session data. A second session is created on the primary server. A second portion of session data comprises a state of the second session and is stored on the primary server. The same backup server(s) are selected for backup of the second portion based on the stored identifier of the first session. The primary server replicates the first and second portions of session data into memory space of the backup servers.
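
The colocation step can be sketched as follows. The round-robin choice of backups for an unrelated session and the `first_session_id` parameter are assumptions made for illustration; the abstract only states that the stored identifier of the first session drives the selection for the second.

```python
from typing import Dict, List, Optional


class BackupServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.memory: Dict[str, dict] = {}  # replicated session state


class PrimaryServer:
    def __init__(self, backups: List[BackupServer], copies: int = 1) -> None:
        self._backups = backups
        self._copies = copies
        self._sessions: Dict[str, dict] = {}
        self._assigned: Dict[str, List[BackupServer]] = {}  # session id -> backups
        self._cursor = 0

    def _select_backups(self) -> List[BackupServer]:
        # Assumed policy: plain round robin over the available backups.
        n = len(self._backups)
        chosen = [self._backups[(self._cursor + i) % n] for i in range(self._copies)]
        self._cursor = (self._cursor + self._copies) % n
        return chosen

    def create_session(self, session_id: str, state: dict,
                       first_session_id: Optional[str] = None) -> List[str]:
        self._sessions[session_id] = state
        if first_session_id in self._assigned:
            # Colocate: reuse the backups recorded for the first session.
            targets = self._assigned[first_session_id]
        else:
            targets = self._select_backups()
        self._assigned[session_id] = targets
        for backup in targets:
            backup.memory[session_id] = dict(state)  # replicate into backup memory
        return [backup.name for backup in targets]
```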
Abstract:
A method, system, and medium are disclosed for performing transparent failover in a cluster server system. The cluster includes a plurality of servers. In servicing a client request, a primary server replicates session data for the client into memory space of one or more backup servers. The primary server sends a response to the client, wherein the response includes an indication of the one or more backup servers. When the client sends a subsequent request, the request includes the indication of the backup servers. If the primary server is unavailable, the cluster determines a recovery server from among the backup servers indicated by the request. The chosen recovery server then services the request.
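
The request flow above can be sketched as below. Requests and responses are modeled as dictionaries, and the field names (`primary`, `backups`, `session_id`, `data`, `served_by`) are hypothetical. How the indication is actually carried, for example in a cookie or a URL, is not stated in the abstract.

```python
from typing import Dict, List


class Cluster:
    def __init__(self, names: List[str], copies: int = 1) -> None:
        self.available: Dict[str, bool] = {name: True for name in names}
        self.memory: Dict[str, Dict[str, dict]] = {name: {} for name in names}
        self.copies = copies

    def handle(self, request: dict) -> dict:
        server = request["primary"]
        if not self.available[server]:
            # Primary is down: pick a recovery server among the backups
            # that the client's request indicates.
            candidates = [b for b in request["backups"] if self.available.get(b)]
            if not candidates:
                raise RuntimeError("no recovery server available")
            server = candidates[0]
        session_id = request["session_id"]
        session = self.memory[server].setdefault(session_id, {})
        session.update(request.get("data", {}))
        # Replicate the session into the memory of the chosen backups.
        backups = [n for n, up in self.available.items()
                   if up and n != server][: self.copies]
        for name in backups:
            self.memory[name][session_id] = dict(session)
        # The response tells the client which servers hold the replicas.
        return {"served_by": server, "backups": backups, "session": dict(session)}
```

Because the client echoes the backup list back, the cluster does not need a central lookup to find where the session was replicated.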
Abstract:
A cluster topology self-healing process is performed in order to replicate a data set stored on a failed node from a first node storing another copy of the data set to a second non-failed node. The self-healing process is performed by: locking one of several domains included in the data set, where locking that domain does not lock any of the other domains in the data set; storing data sent from the first node to the second node in the domain; and releasing the domain. This process of locking, storing, and releasing is repeated for each other domain in the data set. Each domain may be locked for significantly less time than it takes to copy the entire data set. Accordingly, client access requests targeting a locked domain will be delayed for less time than if the entire data set were locked during the self-healing process.
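
The lock, store, release loop might be sketched as follows. It assumes each domain is a dictionary guarded by its own `threading.Lock` on the first node; the abstract does not say where the lock lives, and the `while_locked` hook exists only so the behavior can be observed.

```python
import threading
from typing import Callable, Dict, Optional


class DomainStore:
    """A data set split into independently lockable domains."""

    def __init__(self, domains: Optional[Dict[str, dict]] = None) -> None:
        self.domains: Dict[str, dict] = {n: dict(v) for n, v in (domains or {}).items()}
        self.locks: Dict[str, threading.Lock] = {n: threading.Lock() for n in self.domains}


def self_heal(first: DomainStore, second: DomainStore,
              while_locked: Optional[Callable[[str], None]] = None) -> None:
    """Copy the data set from first to second, one domain at a time."""
    for name in list(first.domains):
        # Lock only this domain; the other domains stay accessible.
        with first.locks[name]:
            second.domains[name] = dict(first.domains[name])
            second.locks.setdefault(name, threading.Lock())
            if while_locked is not None:
                while_locked(name)
        # The lock is released here, before the next domain is touched.
```

A client request for a domain waits only while that one domain is being copied, not for the whole data set.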
Abstract:
Data stored within a cluster may be distributed among nodes each storing a portion of the data. The data may be replicated wherein different nodes store copies of the same portion of the data. In response to detecting the failure of a node, the cluster may initiate a timeout period. If the node remains failed throughout the timeout period, the cluster may copy the portion of the data stored on the failed node onto one or more other nodes of the cluster. If the node returns to the cluster during the timeout period, the cluster may maintain the copy of the data on the previously failed node without copying the portion of the data stored on the failed node onto any other nodes. By delaying self-healing of the cluster for the timeout period, an unbalanced data distribution may be avoided in cases where a failed node quickly rejoins the cluster.
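
A rough sketch of the delayed self-healing decision, assuming the caller supplies timestamps so the logic stays deterministic. The class and method names are hypothetical, and appending to `healed` stands in for the actual copying of data to other nodes.

```python
from typing import Dict, List


class SelfHealingCluster:
    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.failed_at: Dict[str, float] = {}  # node -> time its failure was detected
        self.healed: List[str] = []            # nodes whose data was re-replicated

    def node_failed(self, node: str, now: float) -> None:
        # Start the timeout period; keep the first detection time.
        self.failed_at.setdefault(node, now)

    def node_returned(self, node: str) -> None:
        # The node came back in time: keep its copy, cancel self-healing.
        self.failed_at.pop(node, None)

    def tick(self, now: float) -> List[str]:
        # Heal only nodes that stayed failed for the whole timeout period.
        expired = [n for n, t in self.failed_at.items() if now - t >= self.timeout]
        for node in expired:
            del self.failed_at[node]
            self.healed.append(node)  # placeholder for copying the data elsewhere
        return expired
```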