Abstract:
A computer implemented method, a tangible computer storage medium, and a data processing system provide high availability support for virtual machines in a logical partitioned platform. A monitoring system detects a failure in the virtual machine. Partition management firmware then restarts the virtual machine in a consistency failover image node utilizing a consistency failover image. If a subsequent failure of the virtual machine is detected within a predetermined time, partition management firmware restarts the virtual machine in a boot failover image node utilizing a boot failover image.
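The two-stage restart described in this abstract can be illustrated with a minimal sketch. The class name `FailoverPolicy`, the `window_seconds` parameter, and the 60-second value are assumptions made for illustration; the abstract does not specify how the predetermined time is represented or measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverPolicy:
    """Hypothetical policy object; not taken from the patent text."""

    window_seconds: float                    # the "predetermined time" (assumed unit)
    last_failure_at: Optional[float] = None  # time of the previous detected failure

    def choose_image(self, now: float) -> str:
        """Pick the failover image for a failure detected at time `now`."""
        repeated = (
            self.last_failure_at is not None
            and now - self.last_failure_at <= self.window_seconds
        )
        self.last_failure_at = now
        # A repeat failure inside the window escalates to the boot image.
        return "boot" if repeated else "consistency"


policy = FailoverPolicy(window_seconds=60.0)
first = policy.choose_image(now=0.0)     # first failure
second = policy.choose_image(now=30.0)   # second failure, inside the window
third = policy.choose_image(now=200.0)   # later failure, outside the window
```

In this sketch a failure that arrives after the window has elapsed is treated like a first failure again; whether the described system resets in that way is not stated in the abstract.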
Abstract:
A computer implemented method, apparatus, and computer program product for restarting pseudo terminal streams. In one embodiment, a device associated with a file descriptor in a set of file descriptors is opened. The set of file descriptors is identified in checkpoint data for restarting the pseudo terminal streams. In response to identifying the device as a pseudo terminal slave device, an entry for the identified pseudo terminal slave device is added to a list of open pseudo terminal slave devices. The entry for the identified pseudo terminal slave device is marked as an open pseudo terminal slave device. The list of open pseudo terminal slave devices permits pseudo terminal master devices and pseudo terminal slave devices to be restored and restarted in random order during a restart of the pseudo terminal streams.
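The role of the open-slave list can be sketched as follows. This is an illustrative model only: `PtyRestartState`, `restart_streams`, and the `(kind, pty_id)` descriptor tuples are hypothetical names, and real pseudo terminal devices are not opened here.

```python
from dataclasses import dataclass, field


@dataclass
class PtyRestartState:
    """Hypothetical bookkeeping for one restart; not the patented data layout."""

    open_slaves: dict = field(default_factory=dict)     # pty id -> entry
    restored_masters: set = field(default_factory=set)  # pty ids of restored masters

    def restore(self, kind: str, pty_id: int) -> None:
        if kind == "slave":
            # Add an entry for the slave and mark it as open.
            self.open_slaves[pty_id] = {"open": True}
        elif kind == "master":
            self.restored_masters.add(pty_id)
        else:
            raise ValueError(f"unknown device kind: {kind}")

    def connected_pairs(self) -> set:
        """Pty ids whose master is restored and whose slave is marked open."""
        return {
            pty_id
            for pty_id in self.restored_masters
            if self.open_slaves.get(pty_id, {}).get("open")
        }


def restart_streams(descriptors):
    """Process descriptors from checkpoint data in whatever order they arrive."""
    state = PtyRestartState()
    for kind, pty_id in descriptors:
        state.restore(kind, pty_id)
    return state.connected_pairs()
```

Because pairing is decided from the recorded entries rather than from arrival order, restoring a slave before or after its master gives the same result in this model.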
Abstract:
Migrating a workload partition (WPAR) is provided. Responsive to receiving a request to checkpoint the WPAR, a list of virtual identifiers used by the WPAR to refer to IPC objects is generated and stored. Each virtual identifier corresponds to an IPC object and to a real identifier used by a kernel that corresponds to the IPC object. IPC object data and control information is collected and stored. Each process in the WPAR stores per process data. Responsive to receiving a request to restart the WPAR, the virtual identifier that the WPAR wants to be used is registered. A new IPC object is created by a kernel. The kernel maps a real identifier used by the kernel for the new IPC object to the registered virtual identifier. The restart process retrieves IPC data and control information and overlays it on the new IPC object. The per process data is restored.
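The virtual-to-real identifier mapping in this abstract can be sketched as below. The `Kernel` class, the `checkpoint` and `restart` helpers, and the dictionary layout are assumptions for illustration; the sketch also leaves out the per process data that the abstract mentions.

```python
import itertools


class Kernel:
    """Toy kernel that hands out its own real ids (hypothetical model)."""

    def __init__(self, first_real_id: int):
        self._real_ids = itertools.count(first_real_id)
        self.objects = {}       # real id -> IPC object contents
        self.virt_to_real = {}  # (wpar name, virtual id) -> real id

    def create_ipc(self, wpar: str, virtual_id: int) -> int:
        """Create a new IPC object and map it to the registered virtual id."""
        real_id = next(self._real_ids)
        self.objects[real_id] = {}
        self.virt_to_real[(wpar, virtual_id)] = real_id
        return real_id


def checkpoint(kernel: Kernel, wpar: str) -> dict:
    """Save the WPAR's virtual ids together with each object's contents."""
    return {
        virtual_id: dict(kernel.objects[real_id])
        for (owner, virtual_id), real_id in kernel.virt_to_real.items()
        if owner == wpar
    }


def restart(kernel: Kernel, wpar: str, saved: dict) -> None:
    """Recreate each object under its old virtual id and overlay saved contents."""
    for virtual_id, contents in saved.items():
        real_id = kernel.create_ipc(wpar, virtual_id)
        kernel.objects[real_id].update(contents)
```

In this model the WPAR keeps using virtual identifier 7, say, on both machines, while each kernel is free to pick a different real identifier behind it.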
Abstract:
A data processing system includes: a plurality of resources including a processor, a memory, and an operating system; a mechanism for generating one or more software partitions in addition to an administrative partition; and a global accounting engine which enables monitoring and recording of resource usage at both a global level and a partition level. Partition-level accounting data is returned for selected resources being utilized within a software partition. The data processing system also includes a first software partition, which utilizes one or more of the plurality of resources and which includes a first partition-level accounting engine. The partition-level accounting engine provides monitoring and recording of resource usage within the first software partition and stores first partition usage data within a first partition accounting buffer.
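One way to picture the two accounting levels is the sketch below. The class names, the `record` method, and the resource names are hypothetical; the abstract does not describe an interface or a buffer format.

```python
from collections import defaultdict


class PartitionAccountingEngine:
    """Hypothetical per-partition engine with its own accounting buffer."""

    def __init__(self, name: str):
        self.name = name
        self.buffer = []  # partition accounting buffer: (resource, amount) records

    def record(self, resource: str, amount: int) -> None:
        self.buffer.append((resource, amount))


class GlobalAccountingEngine:
    """Hypothetical global engine that also routes records to partitions."""

    def __init__(self):
        self.totals = defaultdict(int)  # global-level usage per resource
        self.partitions = {}            # partition name -> engine

    def add_partition(self, name: str) -> PartitionAccountingEngine:
        engine = PartitionAccountingEngine(name)
        self.partitions[name] = engine
        return engine

    def record(self, partition: str, resource: str, amount: int) -> None:
        # Every usage record counts toward the global totals.
        self.totals[resource] += amount
        # Records from a known software partition also land in its buffer.
        if partition in self.partitions:
            self.partitions[partition].record(resource, amount)


accounting = GlobalAccountingEngine()
first_partition = accounting.add_partition("wpar1")
accounting.record("wpar1", "cpu_ms", 40)
accounting.record("admin", "cpu_ms", 10)  # administrative partition, global only
```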
Abstract:
A WPAR is migrated. Responsive to starting a checkpoint process, data and control information is collected and stored for IPC objects in the WPAR. Responsive to receiving a request to restart the WPAR, a type of IPC object is determined. Responsive to a determination that the IPC object is not an IPC shared memory object, a kernel handle that a process wants to be used for a new IPC object is registered. A request to create a new IPC object comprising a name uniquely associated with the IPC object and a WPAR identifier is issued. An entry that matches the name and WPAR identifier is identified and a virtual kernel handle is retrieved. The new IPC object is created. The virtual kernel handle is mapped to a real kernel handle and returned to the process. Data and control information is retrieved and overlaid onto the new IPC object.
Abstract:
A computer implemented method, apparatus, and computer program product for a checkpoint process associated with a device driver in a workload partitioned environment. In response to initiation of a checkpoint process, a stream is frozen. The stream comprises a set of kernel modules driving a device. Freezing the stream prevents any module in the set of kernel modules from sending any messages, other than a checkpoint message, to another module in the set of kernel modules. The message block for each module in the set of kernel modules is updated with internal data to form a restart message. The internal data is data describing a state of the module in the set of kernel modules.
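The freeze rule and the restart message can be sketched as follows. `KernelModule`, `Stream`, and the message kinds are illustrative names, and the module names `ldterm` and `ptem` are used only as examples; the actual message block format is not given in the abstract.

```python
class KernelModule:
    """Hypothetical stand-in for one kernel module in the stream."""

    def __init__(self, name: str, state: dict):
        self.name = name
        self.state = state  # internal data describing the module's state


class Stream:
    """Hypothetical stream: an ordered set of modules driving a device."""

    def __init__(self, modules):
        self.modules = list(modules)
        self.frozen = False
        self.delivered = []  # kinds of messages passed between modules

    def send(self, kind: str) -> None:
        # While frozen, only checkpoint messages may pass between modules.
        if self.frozen and kind != "checkpoint":
            raise RuntimeError("stream is frozen; only checkpoint messages allowed")
        self.delivered.append(kind)

    def checkpoint(self) -> list:
        """Freeze the stream and build a restart message from module state."""
        self.frozen = True
        self.send("checkpoint")
        # Each module contributes a copy of its internal data.
        return [(m.name, dict(m.state)) for m in self.modules]


stream = Stream([KernelModule("ldterm", {"echo": True}), KernelModule("ptem", {"rows": 24})])
stream.send("data")                     # allowed before the freeze
restart_message = stream.checkpoint()   # freezes, then collects state
```

After `checkpoint()` returns, a further `stream.send("data")` raises `RuntimeError` in this sketch, which models the rule that no non-checkpoint messages pass while the stream is frozen.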
Abstract:
A computer implemented method, computer program product, and system for creating a checkpoint of a stream. A stream checkpoint request to create the checkpoint of the stream is received, wherein the stream is used by a process as a communications path, and wherein the communications path is modified by a set of modules. In response to identifying the identity of each module in the set of modules, the identity of each module in the set of modules is stored in the checkpoint. In response to identifying an order of the set of modules, the order of the set of modules is stored in the checkpoint. In response to sending a stream checkpoint message to each module in the set of modules, module data is received from each module in the set of modules to form received module data. The received module data is stored in the checkpoint.
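The three items stored in the checkpoint (module identity, module order, and module data) can be sketched as below. `StreamModule`, `on_checkpoint_message`, and the dictionary layout of the checkpoint are assumptions for illustration, not the format used by the described system.

```python
class StreamModule:
    """Hypothetical module that answers a stream checkpoint message."""

    def __init__(self, name: str, state: dict):
        self.name = name
        self.state = state

    def on_checkpoint_message(self) -> dict:
        # Reply to the checkpoint message with a copy of the module data.
        return dict(self.state)


def checkpoint_stream(modules) -> dict:
    """Record identity, order, and data for each module in the stream."""
    checkpoint = {"modules": [], "data": {}}
    for position, module in enumerate(modules):
        # List position preserves the order of the modules.
        checkpoint["modules"].append(module.name)
        # Data is keyed by position so a module appearing twice stays distinct.
        checkpoint["data"][position] = module.on_checkpoint_message()
    return checkpoint


saved_stream = checkpoint_stream(
    [StreamModule("ldterm", {"echo": True}), StreamModule("ptem", {"rows": 24})]
)
```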