Abstract:
Methods, systems and articles of manufacture are provided for automatically starting a node in a clustered computer system. A starting state value may be assigned to the node and a discovery process initiated to find a sponsor node. If a sponsor node is found, the node is joined with the sponsor node in the clustered computer system. If a sponsor node is not found, the node is started as a one-node cluster in the clustered computer system. An active state value is assigned to the node upon inclusion into the clustered computer system.
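The startup flow described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `NodeState`, `Node`, `start_node` and the `discover_sponsor` callback are hypothetical, and cluster membership is modeled as a plain shared set of node names.

```python
from enum import Enum


class NodeState(Enum):
    STARTING = "starting"
    ACTIVE = "active"


class Node:
    def __init__(self, name):
        self.name = name
        self.state = None
        self.cluster = None  # set of member names once the node is clustered


def start_node(node, discover_sponsor):
    """Start `node`. `discover_sponsor` (hypothetical) returns a sponsor Node or None."""
    node.state = NodeState.STARTING
    sponsor = discover_sponsor(node)
    if sponsor is not None:
        # A sponsor was found: join the sponsor's existing cluster.
        sponsor.cluster.add(node.name)
        node.cluster = sponsor.cluster
    else:
        # No sponsor was found: start as a one-node cluster.
        node.cluster = {node.name}
    # The node is now part of a cluster either way.
    node.state = NodeState.ACTIVE
    return node
```

Passing a callback that always returns `None` yields a one-node cluster; passing one that returns an already active node joins that node's cluster.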
Abstract:
A clustered computer system, apparatus, program product and method utilize a group member-initiated shutdown process to terminate clustering on a node in an automated and orderly fashion, typically in the event of a failure detected by a group member residing on that node. As a component of such a process, node leave operations are initiated on the other nodes in a clustered computer system, thereby permitting any dependency failovers to occur in an automated fashion. Moreover, other group members on a node to be shut down are preemptively terminated prior to local detection of the failure within those other group members, so that termination of clustering on the node may be initiated to complete a shutdown operation.
Abstract:
An apparatus, program product and method support the dynamic modification of cluster communication parameters such as a fragmentation size parameter through controllably deferring the processing of a requested fragmentation size change in a source node until after receipt of an acknowledgment message for at least one unacknowledged message sent by the source node to a plurality of target nodes. By controllably deferring such processing until it is confirmed that any such previously-unacknowledged messages sent by a source node have been received by any target nodes, synchronization between the source node and the target nodes may be obtained, and a fragmentation size change may occur in a coordinated fashion such that future messages from the source node to the target nodes will be processed by both the source and the target nodes using the modified fragmentation size parameter.
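The deferral idea can be illustrated with a small sketch. It makes simplifying assumptions that are not in the abstract: each message is acknowledged once (rather than once per target node), the change waits for every outstanding message, and the class and method names (`SourceNode`, `request_fragment_size`, `on_ack`) are hypothetical.

```python
class SourceNode:
    def __init__(self, fragment_size):
        self.fragment_size = fragment_size
        self.unacked = set()      # sequence numbers awaiting acknowledgment
        self.pending_size = None  # requested size that has not been applied yet
        self.next_seq = 0

    def send(self, payload: bytes):
        """Fragment `payload` with the current size; return (seq, fragments)."""
        size = self.fragment_size
        fragments = [payload[i:i + size] for i in range(0, len(payload), size)]
        seq = self.next_seq
        self.next_seq += 1
        self.unacked.add(seq)
        return seq, fragments

    def request_fragment_size(self, new_size):
        """Record a requested change; apply it only when nothing is outstanding."""
        if new_size <= 0:
            raise ValueError("fragment size must be positive")
        self.pending_size = new_size
        self._apply_if_safe()

    def on_ack(self, seq):
        """Handle an acknowledgment and re-check any deferred change."""
        self.unacked.discard(seq)
        self._apply_if_safe()

    def _apply_if_safe(self):
        # Defer the change while previously sent messages remain unacknowledged.
        if self.pending_size is not None and not self.unacked:
            self.fragment_size = self.pending_size
            self.pending_size = None
```

In this sketch a change requested while a message is in flight takes effect only after `on_ack` clears the last outstanding sequence number, so later messages are fragmented with the new size.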
Abstract:
An apparatus, program product and method support the dynamic modification of cluster communication parameters through a distributed protocol whereby individual nodes locally confirm initiation and status information for every node participating in a parameter modification operation. By doing so, individual nodes are also able to locally determine the need to undo locally-performed parameter modifications should any other node be incapable of performing a parameter modification. Moreover, specifically with respect to cluster communication parameters such as heartbeat parameters, such parameters may be dynamically modified by configuring a sending node to send a heartbeat message to a receiving node, with the heartbeat message indicating that a heartbeat parameter is to be modified. In response to the heartbeat message, the receiving node may then send an acknowledgment message to the sending node that indicates whether the heartbeat parameter has been modified in the receiving node. Further, modification of the heartbeat parameter in the sending node may be deferred until the acknowledgment message from the receiving node indicates that the heartbeat parameter has been modified in the receiving node.
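The heartbeat exchange described above might look like the following sketch. The message layout (plain dictionaries with `new_interval` and `modified` keys), the `Sender` and `Receiver` classes, and the `supported` predicate are all assumptions made for illustration; the abstract does not specify them.

```python
class Receiver:
    def __init__(self, interval, supported=lambda value: True):
        self.interval = interval
        self.supported = supported  # hypothetical check for acceptable values

    def on_heartbeat(self, message):
        """Apply a requested heartbeat parameter change and acknowledge it."""
        new_interval = message.get("new_interval")
        modified = False
        if new_interval is not None and self.supported(new_interval):
            self.interval = new_interval
            modified = True
        # The acknowledgment reports whether the parameter was modified.
        return {"type": "ack", "modified": modified}


class Sender:
    def __init__(self, interval):
        self.interval = interval

    def change_interval(self, receiver, new_interval):
        """Piggyback the change on a heartbeat; update locally only after the ack."""
        heartbeat = {"type": "heartbeat", "new_interval": new_interval}
        ack = receiver.on_heartbeat(heartbeat)
        if ack["modified"]:
            # Deferred until the receiver confirms it has modified its value.
            self.interval = new_interval
        return ack["modified"]
```

If the receiver cannot apply the new value, the sender keeps its old interval, which mirrors the deferral described in the abstract.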
Abstract:
An apparatus, program product and method support the dynamic modification of cluster communication parameters such as a fragmentation size parameter through controllably deferring the processing of a requested fragmentation size change in a source node until after receipt of an acknowledgment message for at least one unacknowledged message sent by the source node to a plurality of target nodes. By controllably deferring such processing until it is confirmed that any such previously-unacknowledged messages sent by a source node have been received by any target nodes, synchronization between the source node and the target nodes may be obtained, and a fragmentation size change may occur in a coordinated fashion such that future messages from the source node to the target nodes will be processed by both the source and the target nodes using the modified fragmentation size parameter.
Abstract:
According to the present invention, a communications protocol supporting cluster configurations more complex than a single LAN is disclosed. A cluster destination address table (CDAT) is used in conjunction with a network message servicer to communicate between computer systems in a cluster. Each computer system preferably contains a cluster servicer, a CDAT, and a network message servicer. The CDAT contains network addresses, status and adapter information for each computer system in a cluster. Although computer systems may have alternate network addresses when they have multiple adapters, the CDAT indexes primary and alternate address information under a single named system. Thus, redundant connections amongst computer systems are identified, while still using the numeric addresses upon which the network message servicer is based. To send a message using the methods of the present invention, the cluster servicer retrieves a network address for a computer system from a CDAT. A message to be sent and the retrieved address are passed to the network message servicer, preferably an Internet Protocol suite. The network message servicer formats the information into a packet and routes the packet.
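A cluster destination address table of this kind could be modeled roughly as below. The field names, the "active" status value, and the `transport` callable standing in for the network message servicer are hypothetical. Falling back to an alternate address when the primary fails is also an assumption of this sketch; the abstract says only that redundant connections are identified.

```python
from dataclasses import dataclass


@dataclass
class CdatEntry:
    status: str      # e.g. "active" (hypothetical status value)
    addresses: list  # (adapter, address) pairs, primary first


class Cdat:
    def __init__(self):
        self._entries = {}  # system name -> CdatEntry

    def register(self, system, status, addresses):
        self._entries[system] = CdatEntry(status, list(addresses))

    def addresses_for(self, system):
        """Return the numeric addresses for a named system, primary first."""
        entry = self._entries[system]
        if entry.status != "active":
            raise LookupError(f"{system} is not active")
        return [address for _adapter, address in entry.addresses]


def send_message(cdat, system, message, transport):
    """Look up addresses in the CDAT and hand the message to `transport`.

    Tries the primary address, then any alternates; returns the address used.
    """
    last_error = None
    for address in cdat.addresses_for(system):
        try:
            transport(address, message)
            return address
        except OSError as err:
            last_error = err
    raise ConnectionError(f"no reachable address for {system}") from last_error
```

The caller addresses a system by name, while the transport still receives only numeric addresses, which is the separation the abstract describes.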
Abstract:
A method is provided for assigning an identifier to data processed through protocol layers in one or more computers over a network. A space for the identifier is reserved in the header of each protocol layer. The identifier is then generated at one of the protocol layers. In an embodiment, the identifier is generated at the lowest protocol layer of a computer that sends the data, i.e., the sending computer. Once the identifier is generated, it is then stored in the reserved space in the header.
Abstract:
A method and apparatus are provided for implementing system to system communication in a switchless non-InfiniBand (IB) compliant environment. IB architected multicast facilities are used to communicate between HCAs in a loop or string topology. Multiple HCAs in the network subscribe to a predetermined multicast address. Multicast messages sent by one HCA destined to the predetermined multicast address are received by other HCAs in the network. Intermediate TCA hardware, per IB architected multicast support, forwards the multicast messages on via hardware facilities, which do not require invocation of software facilities, thereby providing performance efficiencies. The messages flow until picked up by an HCA on the network. Architected higher level IB connections, such as IB supported Reliable Connections (RCs), are established using the multicast message flow, eliminating the need for an IB Subnet Manager (SM).
Abstract:
An apparatus, program product and method utilize cluster data port services within a cluster infrastructure to provide reliable and efficient communications between nodes in a clustered computer system. The cluster data port services present an abstracted transport service that encapsulates and manages the establishment of multiple connection paths between a source node, a target node and one or more backup nodes in such a manner that a cluster data port is effectively utilized as a single data port from the perspective of a user program.
Abstract:
According to the present invention, a cluster communications system is provided that supports reliable and efficient cluster communications. The cluster communications system of the preferred embodiment can provide this reliable and efficient communication for cluster configurations extending beyond a single local area network (LAN). The cluster communications system provides reliable and efficient cluster communication by facilitating multicast messaging between systems in the cluster. In particular, the preferred embodiment provides for the establishment of multicast groups between which multicast messaging is provided. The preferred embodiment provides this multicasting while providing the needed mechanisms to assure ordered message delivery between systems. The preferred embodiment extends this efficient and reliable cluster communication by providing for additional point-to-point communication between systems not on the same LAN. Thus, the preferred embodiment provides a cluster communication system that uses reliable multicasting for efficient cluster communication in a way that can be used for clusters that extend beyond a single local area network.