摘要:
A partition manager includes an I/O reconfiguration mechanism and a logical partition suspend/resume mechanism that work together to perform autonomic I/O reconfiguration in a logically partitioned computer system. When I/O reconfiguration is required, the affected logical partitions are suspended, the I/O is reconfigured, and the affected logical partitions are resumed. Because the logical partitions are suspended during I/O reconfiguration, any ghost packet that may occur when the I/O is reconfigured is ignored.
摘要:
A logically-partitioned computer, program product and method utilize a flexible and adaptable communication interface between a partition and a partition manager, which permits optimal handling of partition management operations such as state change operations and the like over a wide variety of circumstances. In particular, a partition is permitted to indicate, in connection with a request to perform a partition management operation, whether an asynchronous notification should be generated or suppressed in association with the performance of the partition management operation by a partition manager. As a result, asynchronous notifications are selectively generated in association with the performance of partition management operations based upon indications in the requests made by partitions for such operations.
摘要:
A partition manager includes a resource detection mechanism that uses a persistent resource database to determine which resources were seen previously, and to determine which resources are required for a logical partition to start. Once all required resources for a logical partition are detected, the logical partition is started. In this manner, a logical partition may be started as soon as all of its resources are available, without waiting on the resources of other logical partitions. In addition, a missing required resource will prevent a logical partition from starting, thus avoiding the crash of a logical partition due to missing resources.
摘要:
A resource and partition manager of the preferred embodiments includes a lock mechanism that operates on a plurality of locks that control access to individual I/O slots. The resource and partition manager uses the lock mechanism to obtain a lock on an I/O slot when transferring control of the I/O slot to a logical partition that is powering on and when removing the I/O slot from a logical partition that is powering off. The resource and partition manager uses the lock mechanism to remove control of an I/O slot from, or return control to, an operating logical partition in order to facilitate hardware service operations on that I/O slot or on the physical enclosure in which it is contained.
摘要:
In a computer system that includes multiple physical enclosures, an enclosure includes a non-volatile memory that includes bus numbering information for its own buses as well as bus numbering information for one or more of its neighbors. In the preferred implementation, all enclosures include a non-volatile memory that includes bus numbering information for its own buses and for both of its neighbors. This creates a distributed database of the interconnection topology for the computer system. Because an enclosure contains bus numbering information about its neighbor enclosure(s), the bus numbers for the buses in the physical enclosures are made persistent across numerous different system reconfigurations. The preferred embodiments also include a bus number manager that reads the non-volatile memories in the physical enclosures during initial program load (i.e., boot) that reconstructs the interconnection topology from the information read from the non-volatile memories, and that assigns bus numbers to the buses according to the derived interconnection topology.
摘要:
An apparatus, program product and method propagate errors detected in an IO fabric element from an IO fabric that is used to couple a plurality of endpoint IO resources to processing elements in a computer. In particular, such errors are propagated to the endpoint IO resources affected by the IO fabric element in connection with recovering from the errors in the IO fabric element. By doing so, a device driver or other program code used to access each affected IO resources may be permitted to asynchronously recover from the propagated error in its associated IO resource, and often without requiring the recovery from the error in the IO fabric element to wait for recovery to be completed for each of the affected IO resources. In addition, an IO fabric may be dynamically configured to support both recoverable and non-recoverable endpoint IO resources. In particular, IO fabric elements within an IO fabric may be dynamically configured to enable machine check signaling in such IO fabric elements in response to detection that an endpoint IO resource is non-recoverable in nature. The IO fabric elements that are dynamically configured as such are disposed within a hardware path that is defined between the non-recoverable resource and a processor that accesses the non-recoverable resource.
摘要:
An apparatus, program product and method in which a memory address space is allocated non-uniformly to IO resources in a memory mapped IO fabric based upon the locations of individual IO endpoints to which such IO resources are coupled. In a PCI-based environment, for example, PCI adapters are allocated memory address ranges in a PCI bus address space based upon the locations of the particular slots within which the PCI adapters are mounted.
摘要:
Method, apparatus and system for isolating input/output adapter interrupt domains in a data processing system. The data processing system includes a plurality of input/output adapters, and isolation of interrupt resources available to the input/output adapters is controlled by functionality in a host bridge that connects the plurality of input/output adapters to a system bus of the data processing system, thus permitting the use of low cost, industry standard switches and bridges external to the host bridge.
摘要:
Method, apparatus and system for controlling input/output adapter data flow operations in a data processing system that includes at least one of a traffic class mechanism in conjunction with virtual channel resources so as to be able to associate Load/Store and DMA flows to/from an input/output adapter, and a relaxed ordering mechanism for associating a relaxed ordering bit to Load/Store operations to an input/output adapter. Functionality for controlling the input/output adapter data flow is provided in a host bridge that connects the input/output adapter to a system bus of the data processing system.
摘要:
Method, apparatus and system for isolating input/output adapter Direct Memory Access addressing domains in a data processing system. The data processing system includes a plurality of input/output adapters, and access to a memory of the data processing system by the plurality of input/output adapters is controlled by functionality in a host bridge that connects the plurality of input/output adapters to a system bus of the data processing system, thus permitting the use of low cost, industry standard switches and bridges external to the host bridge.