Abstract:
A first Virtual Input/Output Server (VIOS) of a VIOS cluster performs the functions of: generating, at a sending daemon of the first VIOS, a send message that is to be transmitted to a receiving daemon at a second VIOS; in response to completion of the generating of the send message, forwarding the send message to a sending virtual small computer systems interface (vscsi) kernel extension (VKE) via a system call interface; and in response to the sending VKE receiving the send message from the sending daemon, forwarding the send message to one or more second VIOSes within the VIOS cluster utilizing a kcluster interface. The sending VKE parses at least one of a message header and a sub-header of the send message; and responsive to detection of a broadcast setting for the send message, the VKE forwards the send message to all nodes within the cluster via a cluster broadcast.
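The routing decision made by the sending VKE can be illustrated with a minimal sketch. It is not the actual kernel extension or the kcluster interface: the `BROADCAST` marker, the `SendMessage` and `Cluster` types, and the choice to skip the sender during a broadcast are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

BROADCAST = "*"  # hypothetical marker standing in for the header's broadcast setting


@dataclass
class SendMessage:
    target: str    # destination node ID, or BROADCAST
    payload: bytes


@dataclass
class Cluster:
    nodes: list
    delivered: dict = field(default_factory=dict)

    def deliver(self, node, msg):
        # Stand-in for handing the message to the receiving daemon on `node`.
        self.delivered.setdefault(node, []).append(msg.payload)


def vke_forward(cluster, sender, msg):
    """Inspect the message target and forward by broadcast or to a single node."""
    if msg.target == BROADCAST:
        # Assumption: the sender does not deliver the broadcast to itself.
        recipients = [n for n in cluster.nodes if n != sender]
    else:
        if msg.target not in cluster.nodes:
            raise KeyError(f"unknown node: {msg.target}")
        recipients = [msg.target]
    for node in recipients:
        cluster.deliver(node, msg)
    return recipients
```

Calling `vke_forward(cluster, "vios1", SendMessage(BROADCAST, b"x"))` on a three-node cluster forwards the payload to the two other nodes, while a named target reaches only that node.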
Abstract:
A first virtual I/O server (VIOS) provides a cluster aware (CA) operating system (OS) executing on a processor resource of the first VIOS to register the first VIOS within a VIOS cluster. The first VIOS comprises a first field/failure data capture (FFDC) module that executes within the first VIOS and performs the functions of: receiving from an event listener a signal indicating that an FFDC event/condition has been detected by the first VIOS; and automatically transmitting FFDC data to the shared storage repository for storage of the FFDC data within the shared storage repository. The FFDC module further performs the function of transmitting, to one or more second VIOSes within the VIOS cluster, one or more messages to inform the one or more second VIOSes of the occurrence of the FFDC event/condition that was detected by the first VIOS.
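The two actions the FFDC module takes on an event (store the data, then inform the peers) can be sketched as follows. The class names, the in-memory `SharedRepository`, and the message text are hypothetical stand-ins; the abstract does not specify any of them.

```python
from dataclasses import dataclass, field


@dataclass
class SharedRepository:
    # In-memory stand-in for the shared storage repository.
    records: list = field(default_factory=list)

    def store(self, origin, data):
        self.records.append((origin, data))


@dataclass
class Vios:
    name: str
    inbox: list = field(default_factory=list)


class FfdcModule:
    def __init__(self, owner, peers, repo):
        self.owner = owner
        self.peers = peers
        self.repo = repo

    def on_event(self, event, data):
        """Called by the event listener when an FFDC event/condition is detected."""
        # Step 1: transmit the captured data to the shared repository.
        self.repo.store(self.owner.name, data)
        # Step 2: inform every other VIOS in the cluster of the occurrence.
        for peer in self.peers:
            peer.inbox.append(f"FFDC event {event!r} on {self.owner.name}")
```

Storing before notifying means a peer that reacts to the message can already find the data in the repository; that ordering is a design choice of this sketch rather than something the abstract states.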
Abstract:
A method, system, and computer program product provide simultaneous debugging of multiple OS image and/or system dump pairs in a distributed storage repository. A management console receives a terminal debugging session request and a cluster selection from an interface and starts a debugger instance. The debugger instance autonomously identifies client LPARs and loads the system dump images assigned to the client LPARs. In response to receiving a selection of first and second client LPARs, the debugger analyzes the first and second system dump images, respectively, and calculates relational information between the first analysis and the second analysis via one or more logical reasoning utilities of the management console. The debugger then loads the relational information to the management console interface with an analysis of one or more similarities between the first and second system dumps.
Abstract:
Hibernation and remote restore functions of a client logical partition (LPAR) that exists within a data processing system having cluster-aware Virtual Input/Output (I/O) Servers (VIOSes) are performed in response to commands received via a virtual control panel (VCP) through an underlying hypervisor. The client hibernation data file is stored in a shared repository by a source/original VIOS assigned to the client. The hypervisor receives a remote restart command and assigns a target/remote client LPAR and a target VIOS. The source I/O adapters and target I/O adapters are locked, and the target VIOS gathers adapter configuration information from the source VIOS and configures the target adapters to perform the I/O functionality that the source adapters provided to the client LPAR. The target VIOS then retrieves the client's hibernation data file, and the client LPAR is restored at the remote LPAR with the target VIOS providing the client's I/O functionality.
Abstract:
In a data processing system having a plurality of virtual input/output servers (VIOSes) configured within a VIOS cluster, a method, data processing system and computer program product provide for autonomous election of a primary node within a virtual input/output server (VIOS) cluster. A first VIOS performs the functions of: detecting that a primary node is required for the VIOS cluster; and autonomously initiating an election process to elect a next primary node from among the VIOSes within the VIOS cluster. When the first VIOS meets the pre-established requirements for becoming a primary node, the first VIOS obtains a lock on a primary node ID field within a VIOS database (DB) and then initiates a primary node commit process to assign the first VIOS as the primary node. The first VIOS issues a notification to the VIOS cluster to notify the other VIOSes that a primary node has been elected.
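The lock-then-commit sequence of the election can be sketched as below. A `threading.Lock` stands in for the lock on the primary node ID field; the `ViosDatabase` class, the `eligible` flag, and the `notify` callback are assumptions made for illustration and do not reflect the actual VIOS DB interface.

```python
import threading


class ViosDatabase:
    """In-memory stand-in for the VIOS DB holding the primary node ID field."""

    def __init__(self):
        self.primary_node_id = None
        self._primary_lock = threading.Lock()

    def try_commit_primary(self, node_id):
        # Obtain the lock on the primary node ID field without blocking.
        if not self._primary_lock.acquire(blocking=False):
            return False
        try:
            if self.primary_node_id is not None:
                return False  # another node committed first
            self.primary_node_id = node_id
            return True
        finally:
            self._primary_lock.release()


def run_election(db, node_id, eligible, notify):
    """Try to elect `node_id`; return True if it became the primary node."""
    # An election is only needed when no primary exists, and only nodes
    # meeting the pre-established requirements may stand.
    if db.primary_node_id is not None or not eligible:
        return False
    if db.try_commit_primary(node_id):
        notify(node_id)  # tell the rest of the cluster who was elected
        return True
    return False
```

The second check of `primary_node_id` inside the lock guards against two nodes both observing an empty field before either acquires the lock.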
Abstract:
A method, system, and computer program product provide a shared virtual memory space via a cluster-aware virtual input/output (I/O) server (VIOS). The VIOS receives a paging file request from a first LPAR and thin-provisions a logical unit (LU) within the virtual memory space as a shared paging file of the same storage amount as the minimum required capacity. The VIOS also autonomously maintains a logical redundancy LU (redundant LU) as a real-time copy of the provisioned/allocated LU, where the redundant LU is a dynamic copy of the allocated LU that is autonomously updated responsive to any changes within the allocated LU. Responsive to a second VIOS attempting to read an LU currently utilized by a first VIOS, the read request is autonomously redirected to the logical redundancy LU. The redundant LU can be utilized to facilitate migration of a client LPAR to a different computing electronic complex (CEC).
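A minimal sketch of the read redirection follows, assuming a block-addressed LU whose redundant copy is updated on every write. The `LogicalUnit` class, the dictionary-backed thin provisioning, and the rule that the first writer becomes the owner are hypothetical simplifications, not the described implementation.

```python
class LogicalUnit:
    """Thin-provisioned LU with a redundant copy kept in step on each write."""

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        self.blocks = {}   # thin: only blocks that were written consume space
        self.mirror = {}   # the redundant LU
        self.owner = None  # VIOS currently utilizing the allocated LU

    def write(self, vios, index, data):
        if not 0 <= index < self.capacity_blocks:
            raise IndexError("block outside provisioned capacity")
        if self.owner is None:
            self.owner = vios  # assumption: the first writer becomes the owner
        if vios != self.owner:
            raise PermissionError("LU is in use by another VIOS")
        self.blocks[index] = data
        self.mirror[index] = data  # propagate the change to the redundant LU

    def read(self, vios, index):
        """Return (data, source); reads by a non-owner go to the redundant LU."""
        if vios == self.owner:
            return self.blocks.get(index, b""), "allocated"
        return self.mirror.get(index, b""), "redundant"
```

In this sketch a read by the owning VIOS is served from the allocated LU, while any other VIOS is transparently served the same data from the redundant copy.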
Abstract:
A method, data processing system and computer program product provide scalable data synchronization for a virtual input/output server (VIOS) cluster and one or more registered callers. A first VIOS is committed as a primary node of the VIOS cluster and performs the functions of: registering one or more callers to receive notification from the first VIOS of specific events occurring within the cluster; receiving notification of an occurrence of one of the specific events; and in response to receiving the notification, retrieving, by a daemon of the first VIOS, a message payload file from a message payload file directory within the shared VIOS DB and passing the message payload file to the API, which forwards/posts the relevant event notification information from the message payload file to the TCP socket of each registered caller.
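The register-then-post flow can be sketched as below. The `EventNotifier` class, the JSON line format, and the dictionary standing in for the message payload file directory are assumptions for illustration; any connected socket object with a `sendall` method can play the role of a registered caller's TCP socket.

```python
import json
import socket


class EventNotifier:
    def __init__(self, payload_dir):
        # Stand-in for the payload file directory: event name -> payload.
        self.payload_dir = payload_dir
        self.callers = {}  # event name -> list of registered sockets

    def register(self, event, sock):
        """Register a caller's socket for notifications of `event`."""
        self.callers.setdefault(event, []).append(sock)

    def on_event(self, event):
        """Retrieve the payload for `event` and post it to each registered caller."""
        payload = self.payload_dir[event]
        line = (json.dumps({"event": event, "info": payload}) + "\n").encode()
        sockets = self.callers.get(event, [])
        for sock in sockets:
            sock.sendall(line)
        return len(sockets)


if __name__ == "__main__":
    # socket.socketpair() gives two connected local sockets; it is used here
    # in place of a real TCP connection to keep the example self-contained.
    caller_side, vios_side = socket.socketpair()
    notifier = EventNotifier({"lu_created": {"lu": "lu0"}})
    notifier.register("lu_created", vios_side)
    notifier.on_event("lu_created")
    print(caller_side.recv(1024).decode().strip())
    caller_side.close()
    vios_side.close()
```

Only the primary node does the posting, so the number of outbound connections grows with the number of registered callers rather than with the number of VIOSes in the cluster.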
Abstract:
A method, data processing system, and computer program product autonomously migrate clients serviced by a first VIOS to other VIOSes in the event of a VIOS cluster “split-brain” scenario generating a primary sub-cluster and a secondary sub-cluster, where the first VIOS is in the secondary sub-cluster. The VIOSes in the cluster continually exchange keep-alive information to provide each VIOS with an up-to-date status of other VIOSes within the cluster and to notify the VIOSes when one or more nodes lose connection to or are no longer communicating with other nodes within the cluster, as occurs with a cluster split-brain event/condition. When this event is detected, a first sub-cluster assumes a primary sub-cluster role and one or more clients served by one or more VIOSes within the secondary sub-cluster are autonomously migrated to other VIOSes in the primary sub-cluster, thus minimizing downtime for clients previously served by the unavailable/uncommunicative VIOSes.
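The migration step can be sketched as follows. The abstract does not say how the primary sub-cluster is chosen or how clients are distributed, so two assumptions are made here: the largest sub-cluster becomes primary, and each migrated client goes to the least-loaded VIOS in it. The function name and data shapes are hypothetical.

```python
def migrate_after_split(sub_clusters, clients_by_vios):
    """Return the new client assignment for VIOSes in the primary sub-cluster.

    sub_clusters: list of sets of VIOS names, one set per partition as
        derived from the keep-alive exchange.
    clients_by_vios: dict mapping a VIOS name to the clients it serves.
    """
    # Assumption: the largest sub-cluster takes the primary role
    # (ties resolve to the first one listed).
    primary = max(sub_clusters, key=len)
    targets = sorted(primary)
    assignment = {v: list(clients_by_vios.get(v, [])) for v in targets}

    for sub in sub_clusters:
        if sub is primary:
            continue
        for vios in sorted(sub):
            for client in clients_by_vios.get(vios, []):
                # Assumption: place each client on the least-loaded target.
                target = min(targets, key=lambda v: len(assignment[v]))
                assignment[target].append(client)
    return assignment
```

For example, with sub-clusters `{"v1", "v2"}` and `{"v3"}`, the clients of `v3` are spread across `v1` and `v2`, and `v3` no longer appears in the resulting assignment.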