Abstract:
A method, apparatus, system, and signal-bearing medium that, in an embodiment, send a broadcast message to a cluster of servers and receive a point-to-point message from a coordinating server of the cluster, where the coordinating server joined the cluster before all other servers in the cluster. The point-to-point message includes routing data regarding all of the servers in the cluster. In an embodiment, the broadcast message includes a record that includes an identification of a new server, resource data regarding the new server, and a time that the new server joins the cluster, and the servers in the cluster add the record to the routing data and send a request to the new server via the record. In another embodiment, the broadcast message includes records for all servers in a second cluster, and the new server sends the routing data to the servers in the second cluster. If a server leaves the cluster, its record is removed. In this way, a cluster can respond to servers dynamically joining and leaving the cluster while reducing network traffic.
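The join and leave handling described above might be sketched as follows. This is an in-memory illustration only, not the claimed implementation: the names `Record`, `Server`, `coordinator`, and `join` are hypothetical, and network broadcast and point-to-point delivery are replaced by direct method calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    server_id: str    # identification of the server
    resources: str    # resource data (format is an assumption)
    join_time: float  # time the server joined the cluster


class Server:
    def __init__(self, record):
        self.record = record
        # Routing data starts with the server's own record.
        self.routing = {record.server_id: record}

    def on_join_broadcast(self, new_record):
        # Existing members add the newcomer's record to their routing data.
        self.routing[new_record.server_id] = new_record

    def on_leave(self, server_id):
        # A departing server's record is removed.
        self.routing.pop(server_id, None)


def coordinator(cluster):
    # The coordinating server is the one that joined before all others.
    return min(cluster, key=lambda s: s.record.join_time)


def join(cluster, newcomer):
    # "Broadcast": every current member learns about the newcomer.
    for member in cluster:
        member.on_join_broadcast(newcomer.record)
    # "Point-to-point": only the coordinator replies with the full routing data.
    if cluster:
        newcomer.routing.update(coordinator(cluster).routing)
    cluster.append(newcomer)
```

Because only the coordinator answers the broadcast, a join costs one broadcast plus one reply rather than a reply from every member.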
Abstract:
An apparatus, program product and method utilize sets of attributes respectively associated with managed resources and policies to match managed resources with individual policies. Multiple managed resources are permitted to be matched with a specific policy, such that the policy applies to all matching managed resources. Furthermore, by providing multiple attributes upon which to match, policies are capable of being defined with varying degrees of specificity, enabling administrators to utilize more generic policies for certain types of managed resources, with more specific policies used to override certain managed resources whenever needed.
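A minimal sketch of attribute-based matching is shown below. The abstract does not say how specificity is measured; this sketch assumes the matching policy with the most attributes wins. The function names and the dictionary layout (`"match"`, `"name"`) are hypothetical.

```python
def matches(policy_attrs, resource_attrs):
    # A policy matches when every attribute it names has the same value
    # on the managed resource.
    return all(resource_attrs.get(k) == v for k, v in policy_attrs.items())


def select_policy(policies, resource_attrs):
    # Assumption: specificity is the number of attributes a policy names,
    # so a more specific policy overrides a more generic one.
    candidates = [p for p in policies if matches(p["match"], resource_attrs)]
    if not candidates:
        return None
    return max(candidates, key=lambda p: len(p["match"]))


policies = [
    {"name": "generic-messaging", "match": {"type": "messaging"}},
    {"name": "engine1-override", "match": {"type": "messaging", "id": "engine1"}},
]
```

With these policies, every messaging resource falls under `generic-messaging` except the one whose `id` is `engine1`, which picks up the override.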
Abstract:
An apparatus, program product and method enable program code that manages a managed resource, e.g., a high availability manager, to receive status information associated with an externally-managed resource such that the program code can properly apply an activation policy to the managed resource in a manner that is consistent with any requirements placed upon that resource by the externally-managed resource. Where, for example, a managed resource is required to be collocated on the same node or computer as an externally-managed resource, the status information may include location information that identifies where the externally-managed resource is currently active, such that the program code can activate the managed resource on the same node as the externally-managed resource.
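The collocation case can be illustrated with a small sketch. It assumes the status information reduces to the name of the node where the externally-managed resource is active; `choose_activation_node` is a hypothetical helper, not part of any high availability manager API.

```python
def choose_activation_node(candidate_nodes, external_location):
    # Activate the managed resource only on the node where the
    # externally-managed resource is currently active.
    if external_location in candidate_nodes:
        return external_location
    # Assumption: with no usable location, the resource stays inactive.
    return None
```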
Abstract:
An application server includes a connection pool that specifies a number of allowable connections, and includes a backend failure detection mechanism and a backend failure recovery mechanism. When the backend failure detection mechanism detects that the backend fails, applications waiting on the hung connections may be notified of the backend failure. The backend failure detection mechanism will then detect when the backend recovers and becomes available once again. Once the backend is available again, the backend failure recovery mechanism increases the number of connections in the connection pool to compensate for the hung connections. As each hung connection is timed out using a network timeout mechanism, the number of allowable connections is reduced. Eventually all of the hung connections will time out, with the result being that the connection pool will contain the same specified number of allowable connections it originally had before the backend failed.
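The pool-size bookkeeping described above might look like the following sketch. It tracks only the counters, not real connections, and the class and method names are hypothetical.

```python
class ConnectionPool:
    def __init__(self, max_connections):
        self.configured_max = max_connections  # originally specified limit
        self.allowed = max_connections         # current allowable connections
        self.hung = 0                          # connections stuck on the failed backend

    def on_backend_failure(self, connections_in_use):
        # Connections in use at failure time are treated as hung.
        self.hung = connections_in_use

    def on_backend_recovered(self):
        # Raise the limit to compensate for hung connections, so the full
        # configured capacity is usable right away.
        self.allowed = self.configured_max + self.hung

    def on_hung_connection_timeout(self):
        # Each hung connection that times out shrinks the limit again,
        # never below the configured value.
        if self.hung > 0:
            self.hung -= 1
            if self.allowed > self.configured_max:
                self.allowed -= 1
```

Once every hung connection has timed out, `allowed` is back at `configured_max`.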
Abstract:
In a method and system for monitoring events occurring at respective servers of a configuration of nodes, a first server located at a first node receives information from a messaging system pertaining to events at servers located at other nodes. The messaging system usefully comprises a highly available (HA) bulletin board or the like. When the first server receives a start event notification pertaining to a second server located at a second node, a direct communication path is established between the first and second servers. The first server identifies events in the second server that affect or are of interest to services of the first server. The first server then registers with the second server, to receive notification through the direct communication path when respective identified events occur.
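The registration flow above can be sketched in memory as follows. The bulletin board and the direct communication path are both replaced by plain method calls, and all names (`Server`, `emits`, `interested_in`) are hypothetical.

```python
class Server:
    def __init__(self, name, emits=(), interested_in=()):
        self.name = name
        self.emits = set(emits)                  # events this server can raise
        self.interested_in = set(interested_in)  # events its services care about
        self.subscribers = {}                    # event -> registered servers
        self.received = []                       # notifications received directly

    def on_start_notification(self, other):
        # Called when the bulletin board reports that `other` has started:
        # register only for the events of interest that `other` can raise.
        for event in sorted(self.interested_in & other.emits):
            other.register(event, self)

    def register(self, event, subscriber):
        self.subscribers.setdefault(event, []).append(subscriber)

    def fire(self, event):
        # Notify registered servers over the (simulated) direct path.
        for subscriber in self.subscribers.get(event, []):
            subscriber.received.append((self.name, event))
```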
Abstract:
A method, apparatus, system, and signal-bearing medium that, in an embodiment, enforce ordering of messages sent from a queue to clients. If a total order indicator is on for a queue associated with a get message request, the next message is sent from the queue to the client if the queue does not have an associated in-doubt transaction. An in-doubt transaction may be a transaction for which the client has not received a commit request. In another embodiment, an authorized client is selected and messages are only sent from the queue to the authorized client.
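The total-order check could be sketched as below. This sketch assumes a transaction counts as in-doubt from the moment a message is delivered until `commit` is called; the class `OrderedQueue` and its methods are hypothetical.

```python
from collections import deque


class OrderedQueue:
    def __init__(self, total_order=True):
        self.total_order = total_order  # the total order indicator
        self.messages = deque()
        self.in_doubt = set()           # transactions not yet committed

    def put(self, message):
        self.messages.append(message)

    def get(self, txn_id):
        if not self.messages:
            return None
        # With total ordering on, hold back the next message while any
        # transaction on this queue is still in doubt.
        if self.total_order and self.in_doubt:
            return None
        self.in_doubt.add(txn_id)
        return self.messages.popleft()

    def commit(self, txn_id):
        self.in_doubt.discard(txn_id)
```

Holding back delivery trades throughput for ordering: a second consumer cannot overtake the first while its transaction might still roll back.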
Abstract:
Methods, apparatuses, and products are disclosed for session replication that include enqueueing sessions on a replication queue and flushing enqueued sessions, from the replication queue to a replication peer, in dependence upon flushing criteria, for storage on a replication medium. The replication medium may be non-volatile storage in a database or remote random access memory. Flushing may be carried out periodically or in dependence upon replication queue depth. Flushing may include aggregating sessions from the replication queue for transmission to the replication peer.
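A depth-triggered flush with aggregation might be sketched as follows. The `send` callback stands in for transmission to the replication peer, and `max_depth` is a hypothetical flushing criterion; a periodic flush would simply call `flush()` from a timer.

```python
class ReplicationQueue:
    def __init__(self, send, max_depth=3):
        self.send = send            # delivers one aggregated batch to the peer
        self.max_depth = max_depth  # flushing criterion: queue depth
        self.pending = []

    def enqueue(self, session):
        self.pending.append(session)
        if len(self.pending) >= self.max_depth:
            self.flush()

    def flush(self):
        # Aggregate everything enqueued so far into a single transmission.
        if self.pending:
            batch, self.pending = self.pending, []
            self.send(batch)
```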
Abstract:
Method and apparatus, in a data processing system that uses JMX as a mechanism for managing internal components, for processing JMX requests to a managed group that includes a plurality of group members. When a JMX request is transmitted to a first member of a group comprising a plurality of group members, a determination is made whether the first member is in an active state capable of processing the request. If the first member is not in an active state, the JMX request is forwarded to a currently active member of the plurality of members for processing the request. The invention thus permits users to communicate with the group via JMX without knowing which member of the group is active at any particular time.
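The forwarding decision can be illustrated independently of JMX itself. The sketch below is plain Python and does not use any JMX API; `Member` and `handle` are hypothetical names, and a request is just a string.

```python
class Member:
    def __init__(self, name, active=False):
        self.name = name
        self.active = active

    def handle(self, request, group):
        # An active member processes the request itself.
        if self.active:
            return f"{self.name} processed {request}"
        # Otherwise forward to whichever member is currently active.
        target = next((m for m in group if m.active), None)
        if target is None:
            raise RuntimeError("no active member in group")
        return target.handle(request, group)
```

The caller can address any member and still get an answer from the active one.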
Abstract:
An apparatus and method provide a quorum-based server power-down mechanism that allows a manager in a computer cluster to power down unresponsive servers in a manner that assures that an unresponsive server does not become responsive again. In order for a manager in a cluster to power down servers in the cluster, the cluster must have quorum, meaning that a majority of the computers in the cluster must be responsive. If the cluster has quorum, and if the manager server did not fail, the manager causes the failed server(s) to be powered down. If the manager server did fail, the new manager causes all unresponsive servers in the cluster to be powered down. If the power-down is successful, the resources on the failed server(s) may be failed over to other servers in the cluster that were not powered down. If the power-down is not successful, the cluster is disabled.
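The decision sequence above might be sketched as follows. The `power_down` callback, the returned labels, and the "no-action" result when quorum is missing are all assumptions made for illustration; the abstract states only that quorum is required before powering servers down.

```python
def handle_failures(all_servers, responsive, power_down):
    # Quorum: a strict majority of the cluster must be responsive.
    if 2 * len(responsive) <= len(all_servers):
        return "no-action"  # assumption: without quorum nothing is powered down

    # The acting manager (original or newly elected) powers down every
    # unresponsive server so it cannot become responsive again.
    failed = sorted(set(all_servers) - set(responsive))
    if all(power_down(server) for server in failed):
        return "failover"          # resources may move to surviving servers
    return "cluster-disabled"      # a power-down attempt did not succeed
```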
Abstract:
A method, apparatus, system, and signal-bearing medium that, in an embodiment, receive remote procedure calls that request data transfers between a first memory allocated to a first logical partition and a second memory shared among multiple logical partitions. If the first memory and the second memory are accessed via addresses of different sizes, the data is copied between the first memory and the second memory. Further, the data is periodically copied between the second memory and network attached storage.