Abstract:
A clustered computer system includes multiple computer systems (or nodes) that are coupled together via one or more networks and can become members of a group to work on a particular task. Each node includes a cluster engine, a cluster communication mechanism that includes a sliding send window, and one or more service tasks that process messages. The sliding send window allows a node to send out multiple messages without waiting for an individual acknowledgment of each message. The sliding send window also allows a node that received the multiple messages to send a single acknowledgment message for multiple received messages. By using a sliding send window to communicate with other computer systems in the cluster, the communication traffic in the cluster is greatly reduced, thereby enhancing the overall performance of the cluster. In addition, the latency between multiple messages sent concurrently is dramatically reduced.
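The sender side of a sliding send window can be sketched as follows. This is a minimal illustration, not the patented mechanism: the class name `SlidingSendWindow`, the fixed window size, the `transmit` callback, and the use of cumulative acknowledgments are all assumptions made for the example.

```python
from collections import deque


class SlidingSendWindow:
    """Sender side: up to `size` messages may be unacknowledged at once."""

    def __init__(self, size):
        self.size = size
        self.next_seq = 0         # sequence number for the next new message
        self.in_flight = deque()  # sequence numbers sent but not yet acknowledged

    def can_send(self):
        return len(self.in_flight) < self.size

    def send(self, payload, transmit):
        """Send one message if the window has room; return its sequence number."""
        if not self.can_send():
            raise RuntimeError("window full; wait for an acknowledgment")
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight.append(seq)
        transmit(seq, payload)    # hypothetical callback that puts the message on the wire
        return seq

    def ack(self, up_to_seq):
        """Assumed cumulative ACK: one message acknowledges every seq <= up_to_seq."""
        while self.in_flight and self.in_flight[0] <= up_to_seq:
            self.in_flight.popleft()


sent = []
window = SlidingSendWindow(size=4)
for text in ["a", "b", "c", "d"]:
    # Four messages go out back to back, with no acknowledgment in between.
    window.send(text, lambda seq, payload: sent.append((seq, payload)))
window_full = not window.can_send()
window.ack(2)  # a single acknowledgment covering messages 0, 1 and 2
```

After the single acknowledgment, only message 3 remains in flight and the window has room for three more sends.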
Abstract:
According to the present invention, a communications protocol supporting cluster configurations more complex than a single LAN is disclosed. A cluster destination address table (CDAT) is used in conjunction with a network message servicer to communicate between computer systems in a cluster. Each computer system preferably contains a cluster servicer, a CDAT, and a network message servicer. The CDAT contains network addresses, status and adapter information for each computer system in a cluster. Although computer systems may have alternate network addresses when they have multiple adapters, the CDAT indexes primary and alternate address information under a single named system. Thus, redundant connections amongst computer systems are identified, while still using the numeric addresses upon which the network message servicer is based. To send a message using the methods of the present invention, the cluster servicer retrieves a network address for a computer system from a CDAT. A message to be sent and the retrieved address are passed to the network message servicer, preferably an Internet Protocol suite. The network message servicer formats the information into a packet and routes the packet.
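The lookup step described above might look like the following sketch. The class and method names (`CDAT`, `AddressEntry`, `lookup`, `set_status`), the per-address `usable` flag, and the example addresses (taken from the reserved documentation ranges) are assumptions for illustration; the abstract does not specify the table layout.

```python
from dataclasses import dataclass, field


@dataclass
class AddressEntry:
    address: str         # numeric network address
    adapter: str         # adapter the address belongs to
    usable: bool = True  # assumed status flag


@dataclass
class CDAT:
    # system name -> list of AddressEntry, primary address first, alternates after
    systems: dict = field(default_factory=dict)

    def add(self, system_name, address, adapter):
        self.systems.setdefault(system_name, []).append(AddressEntry(address, adapter))

    def set_status(self, system_name, address, usable):
        for entry in self.systems[system_name]:
            if entry.address == address:
                entry.usable = usable

    def lookup(self, system_name):
        """Return the first usable address indexed under the named system."""
        for entry in self.systems.get(system_name, []):
            if entry.usable:
                return entry.address
        raise LookupError(f"no usable address for {system_name}")


cdat = CDAT()
cdat.add("SYSTEM_A", "192.0.2.10", "adapter0")     # primary
cdat.add("SYSTEM_A", "198.51.100.10", "adapter1")  # alternate
primary = cdat.lookup("SYSTEM_A")
cdat.set_status("SYSTEM_A", "192.0.2.10", usable=False)
alternate = cdat.lookup("SYSTEM_A")
```

The address returned by `lookup` is what a cluster servicer would hand, together with the message, to the network message servicer.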
Abstract:
In a clustered computer system with multiple power domains, a bus number manager within each power domain manages multiple nodes independently of other power domains. A node within a specified power domain includes a non-volatile memory that includes bus numbering information for its own buses as well as bus numbering information for two of its logically-interconnected neighbors. This creates a distributed database of the interconnection topology for each power domain. Because a node contains bus numbering information about its logical neighbor node(s), the bus numbers for the buses in the nodes are made persistent across numerous different system reconfigurations. The bus number manager reads the non-volatile memories in the nodes during initial program load (i.e., boot), reconstructs the interconnection topology from the information read from the non-volatile memories, and assigns bus numbers to the buses according to the derived interconnection topology.
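One way to picture how neighbor copies make bus numbers persistent is the sketch below. It is heavily simplified and hypothetical: each node is assumed to own a single bus, a blank memory is modeled as `None`, and the record layout and the function name `reconstruct_bus_numbers` are invented for the example.

```python
def reconstruct_bus_numbers(nvram_records):
    """Rebuild bus numbers at boot from per-node non-volatile records.

    nvram_records maps node id -> record, or None if that node's memory is blank.
    A record looks like {"own": bus_number, "neighbors": {neighbor_id: bus_number}}.
    """
    assigned = {}
    # 1. Trust every node's own stored bus number.
    for node_id, record in nvram_records.items():
        if record is not None:
            assigned[node_id] = record["own"]
    # 2. Recover numbers for blank nodes from the copies held by their neighbors.
    for record in nvram_records.values():
        if record is None:
            continue
        for neighbor_id, bus_number in record["neighbors"].items():
            assigned.setdefault(neighbor_id, bus_number)
    # 3. Nodes unknown to everyone are new: give them the next free numbers.
    next_free = max(assigned.values(), default=0) + 1
    for node_id in nvram_records:
        if node_id not in assigned:
            assigned[node_id] = next_free
            next_free += 1
    return assigned


records = {
    "A": {"own": 10, "neighbors": {"D": 13, "B": 11}},
    "B": None,  # memory replaced; its old number survives in A's and C's copies
    "C": {"own": 12, "neighbors": {"B": 11, "D": 13}},
    "D": {"own": 13, "neighbors": {"C": 12, "A": 10}},
    "E": None,  # newly added node, known to nobody
}
bus_numbers = reconstruct_bus_numbers(records)
```

Node B keeps bus number 11 even though its own memory is blank, while the new node E receives a fresh number.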
Abstract:
An apparatus, program product and method support the dynamic modification of cluster communication parameters through a distributed protocol whereby individual nodes locally confirm initiation and status information for every node participating in a parameter modification operation. By doing so, individual nodes are also able to locally determine the need to undo locally-performed parameter modifications should any other node be incapable of performing a parameter modification. Moreover, specifically with respect to cluster communication parameters such as heartbeat parameters, such parameters may be dynamically modified by configuring a sending node to send a heartbeat message to a receiving node, with the heartbeat message indicating that a heartbeat parameter is to be modified. In response to the heartbeat message, the receiving node may then send an acknowledgment message to the sending node that indicates whether the heartbeat parameter has been modified in the receiving node. Further, modification of the heartbeat parameter in the sending node may be deferred until the acknowledgment message from the receiving node indicates that the heartbeat parameter has been modified in the receiving node.
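The heartbeat exchange described above can be sketched as follows. The message fields (`new_interval`, `modified`), the class names, and the choice of the heartbeat interval as the parameter being changed are assumptions for illustration only.

```python
class Receiver:
    def __init__(self, interval, accept=True):
        self.interval = interval
        self.accept = accept  # False simulates a node that cannot apply the change

    def on_heartbeat(self, message):
        """Apply a requested change if possible and report the outcome in the ACK."""
        new_interval = message.get("new_interval")
        modified = False
        if new_interval is not None and self.accept:
            self.interval = new_interval
            modified = True
        return {"type": "ack", "modified": modified}


class Sender:
    def __init__(self, interval):
        self.interval = interval
        self.pending = None  # requested value, not yet applied locally

    def request_change(self, new_interval):
        self.pending = new_interval

    def send_heartbeat(self, receiver):
        message = {"type": "heartbeat"}
        if self.pending is not None:
            message["new_interval"] = self.pending  # piggyback the change request
        ack = receiver.on_heartbeat(message)
        # Local modification is deferred until the receiver confirms its own.
        if self.pending is not None and ack["modified"]:
            self.interval = self.pending
            self.pending = None
        return ack


ok_sender, ok_receiver = Sender(interval=3), Receiver(interval=3)
ok_sender.request_change(1)
ok_sender.send_heartbeat(ok_receiver)

stuck_sender, refusing_receiver = Sender(interval=3), Receiver(interval=3, accept=False)
stuck_sender.request_change(1)
stuck_sender.send_heartbeat(refusing_receiver)
```

In the second exchange the sender keeps its old interval because the acknowledgment reports that the receiver did not modify the parameter.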
Abstract:
The preferred embodiment of the present invention provides a cluster node distress system and method that improves the reliability of a cluster. The cluster node distress system provides a cluster node distress signal when a node on the cluster is about to fail. This allows the cluster to better determine whether a non-communicating node has failed or has merely been partitioned from the cluster. The preferred cluster node distress system is embedded deeply into the operating system and provides a pre-built node distress signal that can be quickly sent to other nodes in the cluster when an imminent failure of that node is detected, improving the probability that the node distress signal will get out before the node totally fails. When the node distress signal is effectively sent to other nodes in the cluster, the cluster can accurately determine that the node has failed and has not just been partitioned from the cluster. This allows the cluster to respond correctly, i.e., by assigning other nodes primary responsibility, with less intervention needed by administrators.
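The idea of a pre-built distress signal can be sketched as below. The JSON packet format, the class names, and the `send` callback are assumptions; the abstract does not describe the signal's encoding or how it is embedded in the operating system.

```python
import json


class Node:
    def __init__(self, name):
        self.name = name
        # Built once at startup, so nothing has to be formatted while the node is failing.
        self.distress_packet = json.dumps({"type": "distress", "node": name}).encode()

    def on_imminent_failure(self, send):
        send(self.distress_packet)  # hypothetical transmit callback


class ClusterMonitor:
    """Runs on a surviving node and interprets silence from its peers."""

    def __init__(self):
        self.distressed = set()

    def on_packet(self, packet):
        message = json.loads(packet.decode())
        if message["type"] == "distress":
            self.distressed.add(message["node"])

    def classify_silent_node(self, name):
        # A distress signal removes the ambiguity between failure and partition.
        return "failed" if name in self.distressed else "failed or partitioned"


monitor = ClusterMonitor()
Node("node1").on_imminent_failure(monitor.on_packet)
node1_status = monitor.classify_silent_node("node1")
node2_status = monitor.classify_silent_node("node2")
```

Only a node whose distress signal arrived is classified as failed; a silent node without one remains ambiguous.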
Abstract:
Methods, systems and articles of manufacture are provided for automatically starting a node in a clustered computer system. A starting state value may be assigned to the node and a discovery process initiated to find a sponsor node. If a sponsor node is found, the node is joined with the sponsor node in the clustered computer system. If a sponsor node is not found, the node is started as a one-node cluster in the clustered computer system. An active state value is assigned to the node upon inclusion into the clustered computer system.
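The start sequence above maps naturally onto a small state machine. In this sketch the `NodeState` values, the `Node` class, and the `discover_sponsor` callback (which returns a sponsor node or `None`) are hypothetical names chosen for illustration.

```python
from enum import Enum


class NodeState(Enum):
    STARTING = "starting"
    ACTIVE = "active"


class Node:
    def __init__(self, name):
        self.name = name
        self.state = None
        self.cluster = None  # list of member names, shared by all members


def start_node(node, discover_sponsor):
    node.state = NodeState.STARTING
    sponsor = discover_sponsor()          # discovery process; None means no sponsor found
    if sponsor is not None:
        sponsor.cluster.append(node.name)  # join the sponsor's cluster
        node.cluster = sponsor.cluster
    else:
        node.cluster = [node.name]         # start as a one-node cluster
    node.state = NodeState.ACTIVE
    return node


first = start_node(Node("n1"), lambda: None)    # no sponsor available
second = start_node(Node("n2"), lambda: first)  # n1 acts as the sponsor
```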
Abstract:
An apparatus, program product and method support the dynamic modification of cluster communication parameters such as a fragmentation size parameter through controllably deferring the processing of a requested fragmentation size change in a source node until after receipt of an acknowledgment message for at least one unacknowledged message sent by the source node to a plurality of target nodes. By controllably deferring such processing until it is confirmed that any such previously-unacknowledged messages sent by a source node have been received by any target nodes, synchronization between the source node and the target nodes may be obtained, and a fragmentation size change may occur in a coordinated fashion such that future messages from the source node to the target nodes will be processed by both the source and the target nodes using the modified fragmentation size parameter.
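A deferred fragmentation size change might be sketched as follows. The example simplifies the description in two assumed ways: it tracks a single acknowledgment per message rather than one per target node, and it waits until every outstanding message is acknowledged. All names are hypothetical.

```python
class FragmentingSender:
    def __init__(self, fragment_size):
        self.fragment_size = fragment_size
        self.pending_size = None  # requested size, not yet in effect
        self.unacked = set()      # ids of messages sent but not yet acknowledged
        self.next_id = 0

    def send(self, data):
        """Split data into fragments using the size currently in effect."""
        message_id = self.next_id
        self.next_id += 1
        fragments = [data[i:i + self.fragment_size]
                     for i in range(0, len(data), self.fragment_size)]
        self.unacked.add(message_id)
        return message_id, fragments

    def request_fragment_size(self, new_size):
        self.pending_size = new_size
        self._apply_if_quiescent()

    def on_ack(self, message_id):
        self.unacked.discard(message_id)
        self._apply_if_quiescent()

    def _apply_if_quiescent(self):
        # The change takes effect only once nothing sent under the old size is outstanding.
        if self.pending_size is not None and not self.unacked:
            self.fragment_size = self.pending_size
            self.pending_size = None


sender = FragmentingSender(fragment_size=4)
first_id, first_fragments = sender.send(b"abcdefgh")
sender.request_fragment_size(2)
size_while_unacked = sender.fragment_size  # change is still deferred here
sender.on_ack(first_id)
_, second_fragments = sender.send(b"abcd")
```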
Abstract:
A clustered computer system, apparatus, program product and method utilize a group member-initiated shutdown process to terminate clustering on a node in an automated and orderly fashion, typically in the event of a failure detected by a group member residing on that node. As a component of such a process, node leave operations are initiated on the other nodes in a clustered computer system, thereby permitting any dependency failovers to occur in an automated fashion. Moreover, other group members on a node to be shut down are preemptively terminated prior to local detection of the failure within those other group members, so that termination of clustering on the node may be initiated to complete a shutdown operation.
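The shutdown steps above could be ordered as in the sketch below. The relative order of the node leave operations and the preemptive terminations is an assumption, as are the function name, the `members_by_node` mapping, and the action tuples; the sketch records actions rather than performing them.

```python
def member_initiated_shutdown(local_node, members_by_node, failed_member):
    """Return the ordered actions taken when a group member detects a failure."""
    actions = []
    # 1. Initiate node leave on every other node so dependency failovers can start.
    for node in members_by_node:
        if node != local_node:
            actions.append(("node_leave", node, local_node))
    # 2. Preemptively terminate the other local members before they notice the failure.
    for member in members_by_node[local_node]:
        if member != failed_member:
            actions.append(("terminate_member", local_node, member))
    # 3. End clustering on the local node to complete the shutdown.
    members_by_node[local_node] = []
    actions.append(("end_clustering", local_node, None))
    return actions


members = {"n1": ["m1", "m2"], "n2": ["m3"], "n3": ["m4"]}
actions = member_initiated_shutdown("n1", members, failed_member="m1")
```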
Abstract:
An apparatus and method allow processing sequenced records across multiple network connections. A “logical connection” is defined to include one or more network connections. Each message is assigned a sequence number that allows the messages to be ordered on the other end according to sequence number, regardless of which network connection in the logical connection is used to transfer the message. By defining messages, sequencing those messages, and transferring the messages over multiple network connections, the throughput and performance of networked computer systems are substantially increased.
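The receiving end of such a logical connection can be sketched with a reorder buffer. The class name `LogicalConnectionReceiver`, the zero-based sequence numbers, and the min-heap are choices made for this example, not details taken from the abstract.

```python
import heapq


class LogicalConnectionReceiver:
    """Delivers messages in sequence order regardless of which connection carried them."""

    def __init__(self):
        self.expected = 0  # next sequence number to deliver
        self.pending = []  # min-heap of (sequence number, message) that arrived early

    def receive(self, seq, message):
        """Accept one message from any connection; return those now deliverable in order."""
        heapq.heappush(self.pending, (seq, message))
        ready = []
        while self.pending and self.pending[0][0] == self.expected:
            ready.append(heapq.heappop(self.pending)[1])
            self.expected += 1
        return ready


receiver = LogicalConnectionReceiver()
delivered = []
# Arrival order as interleaved by two network connections of unequal speed.
for seq, message in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:
    delivered.extend(receiver.receive(seq, message))
```

Messages that arrive ahead of their turn wait in the heap until the gap before them is filled.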