Abstract:
Methods, systems, and articles of manufacture for replacement of a failing processor of a multi-processor system running at least one operating system are provided. In contrast to the prior art, the replacement may be performed by system firmware without intervention by the operating system (i.e., the replacement may be transparent to the operating system). For some embodiments, the multi-processor system may be logically partitioned and the methods may be utilized to replace one or more shared or dedicated processors assigned to a logical partition, transparent to an operating system running on the partition.
Abstract:
A task synchronization mechanism operates on a global lock that is shared between processors and on local locks that are not shared between processors. The local locks are processor-specific locks. Each processor-specific lock is dedicated to a particular processor in the system. When shared access to a resource is required, a processor updates its processor-specific lock to indicate the processor is sharing the resource. Because each processor-specific lock is dedicated to a particular processor, this eliminates a significant portion of the memory bus traffic associated with all processors reading and updating the same lock. When exclusive access to a resource is required, the requesting processor waits until the counts of all processor-specific locks indicate that none of these processors has a lock on the resource. Once no processor has a lock on the resource, exclusive access to the resource may be granted.
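The shared/exclusive split described above can be modeled in a few lines. The following is only a minimal sketch under stated assumptions: the class name, method names, and the `exclusive_pending` flag are hypothetical, Python threads stand in for processors, and the exclusive path returns `False` instead of waiting, so that the example stays short. It is not the mechanism as claimed.

```python
import threading


class PerProcessorSharedLock:
    """Illustrative model: one shared-holder count per processor plus a global lock.

    All names here are hypothetical and chosen for this sketch only.
    """

    def __init__(self, num_processors):
        self.global_lock = threading.Lock()   # taken only for exclusive access
        self.exclusive_pending = False        # set while exclusive access is sought or held
        # One count and one guard per processor: shared access touches only its own slot.
        self.local_counts = [0] * num_processors
        self.local_guards = [threading.Lock() for _ in range(num_processors)]

    def try_acquire_shared(self, cpu):
        # Shared access updates only this processor's own lock.
        with self.local_guards[cpu]:
            if self.exclusive_pending:
                return False
            self.local_counts[cpu] += 1
            return True

    def release_shared(self, cpu):
        with self.local_guards[cpu]:
            self.local_counts[cpu] -= 1

    def try_acquire_exclusive(self):
        # Exclusive access takes the global lock, then checks every local count.
        if not self.global_lock.acquire(blocking=False):
            return False
        self.exclusive_pending = True
        for cpu, guard in enumerate(self.local_guards):
            with guard:
                if self.local_counts[cpu] != 0:
                    # A processor still shares the resource; back off
                    # (a fuller version would wait here instead).
                    self.exclusive_pending = False
                    self.global_lock.release()
                    return False
        return True

    def release_exclusive(self):
        self.exclusive_pending = False
        self.global_lock.release()
```

In this model, shared requests from different processors never contend on a common location; only an exclusive request walks all processor-specific slots.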
Abstract:
An apparatus, program product and method for automatically and transparently determining the time required to migrate a logical partition. This determined latency may be used to update clocks and other time-related values of the migrated logical partition.
Abstract:
Diagnostic data, such as a time increment corresponding to how long a thread waits to access a shared resource, is stored within a predetermined location in a data structure, such as a hash bucket in a hash table. The location is preferably correlated to the resource such that a display of the diagnostic data may be tailored to reflect a user-specified relationship between the data and resource.
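One way to picture this storage scheme is a fixed-size table in which a resource identifier selects the bucket that accumulates its wait time. The sketch below is an assumption-laden illustration: `WaitTimeTable`, its methods, and the use of integer resource IDs with Python's built-in `hash` are all hypothetical choices, and colliding resources simply share a bucket.

```python
class WaitTimeTable:
    """Hypothetical table that accumulates thread wait time per hash bucket."""

    def __init__(self, num_buckets=16):
        self.buckets = [0.0] * num_buckets

    def bucket_for(self, resource_id):
        # The bucket index is derived from the resource, which correlates
        # the stored diagnostic data with that resource.
        return hash(resource_id) % len(self.buckets)

    def record_wait(self, resource_id, seconds):
        # Add one wait-time increment to the resource's bucket.
        self.buckets[self.bucket_for(resource_id)] += seconds

    def report(self, resource_ids):
        # Tailor the display to the resources the user asks about.
        return {rid: self.buckets[self.bucket_for(rid)] for rid in resource_ids}


table = WaitTimeTable(num_buckets=8)
table.record_wait(3, 0.5)
table.record_wait(3, 0.25)
table.record_wait(4, 1.0)
summary = table.report([3, 4])
```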
Abstract:
A logical partition debugger allows debugging one logical partition in a computer system without requiring the shutdown of other logical partitions. The logical partition debugger is implemented in software in the partition manager. The logical partition debugger provides many common debug functions known in existing hardware and software debuggers, but does so in a manner that only the partition being debugged is affected.
Abstract:
An apparatus, program product, and method utilize a memory access interrupt to effect a reset of a processor in a multi-processor environment. Specifically, a source processor is permitted to initiate a reset of a target processor simply by generating both a reset request and a memory access interrupt for the target processor. The target processor is then specifically configured to detect the presence of a pending reset request during handling of the memory access interrupt, such that the target processor will perform a reset operation responsive to detection of such a request.
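The two-step handshake (post a reset request, then raise a memory access interrupt) can be simulated as follows. This is a rough sketch only: the `Processor` class, its fields, and `request_reset` are hypothetical names, and a direct method call stands in for hardware interrupt delivery.

```python
class Processor:
    """Hypothetical target processor that checks for a reset request in its handler."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.reset_requested = False
        self.reset_count = 0
        self.log = []

    def handle_memory_access_interrupt(self):
        # During interrupt handling, look for a pending reset request first.
        if self.reset_requested:
            self.reset_requested = False
            self.reset()
            return
        self.log.append("memory access interrupt handled")

    def reset(self):
        self.reset_count += 1
        self.log.append("reset")


def request_reset(target):
    # Source side: generate the reset request, then the interrupt.
    target.reset_requested = True
    target.handle_memory_access_interrupt()  # stands in for interrupt delivery


cpu = Processor(cpu_id=1)
cpu.handle_memory_access_interrupt()  # ordinary interrupt, no reset pending
request_reset(cpu)                    # interrupt with a pending reset request
```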
Abstract:
An apparatus, program product and method support the deallocation of a data structure in a multithreaded computer without requiring the use of computationally expensive semaphores or spin locks. Specifically, access to a data structure is governed by a shared pointer that, when a request is received to deallocate the data structure, is initially set to a value that indicates to any thread that later accesses the pointer that the data structure is not available. In addition, to address any thread that already holds a copy of the shared pointer, and thus is capable of accessing the data structure via the shared pointer after the initiation of the request, all such threads are monitored to determine whether any thread is still using the shared pointer by determining whether any thread is executing program code that is capable of using the shared pointer to access the data structure. Once this condition is met, it is ensured that no thread can potentially access the data structure via the shared pointer, and as such, the data structure may then be deallocated.
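The two phases described above (invalidate the shared pointer, then wait until no thread is executing code that could still use an old copy) might be modeled as below. This is a single-threaded sketch under assumptions: `SharedStructure`, the per-thread `in_accessor` flags, and the method names are hypothetical, and the flags are just one conceivable way to determine whether a thread is inside pointer-using code.

```python
class SharedStructure:
    """Hypothetical model of deallocation governed by a shared pointer."""

    def __init__(self, payload, num_threads):
        self.pointer = payload                     # None means "not available"
        self.in_accessor = [False] * num_threads   # each flag is written only by its own thread
        self.pending = None                        # structure awaiting deallocation
        self.deallocated = False

    def enter(self, tid):
        # A thread marks that it is executing code able to use the pointer,
        # then takes its own copy (which may be None).
        self.in_accessor[tid] = True
        return self.pointer

    def leave(self, tid):
        self.in_accessor[tid] = False

    def request_deallocate(self):
        # Phase 1: later readers of the pointer see the structure as unavailable.
        self.pending = self.pointer
        self.pointer = None

    def try_finish_deallocate(self):
        # Phase 2: deallocate only once no thread is in pointer-using code.
        if any(self.in_accessor):
            return False
        self.pending = None
        self.deallocated = True
        return True


s = SharedStructure({"key": 1}, num_threads=2)
old_copy = s.enter(0)          # thread 0 holds a copy before the request
s.request_deallocate()
late_copy = s.enter(1)         # thread 1 arrives afterwards and gets None
s.leave(1)
blocked = s.try_finish_deallocate()   # thread 0 is still inside
s.leave(0)
done = s.try_finish_deallocate()
```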
Abstract:
A partition manager includes an I/O reconfiguration mechanism and a logical partition suspend/resume mechanism that work together to perform autonomic I/O reconfiguration in a logically partitioned computer system. When I/O reconfiguration is required, the affected logical partitions are suspended, the I/O is reconfigured, and the affected logical partitions are resumed. Because the logical partitions are suspended during I/O reconfiguration, any ghost packet that may occur when the I/O is reconfigured is ignored.
Abstract:
An apparatus, program product and method for tracking the state of a migrating logical partition. Embodiments may use the state to determine the readiness and/or appropriateness of a page of the logical partition for transferring. The state may include a value or other data used to track changes affecting the page or the relative ease and/or appropriateness of migrating the page. A page manager table with entries corresponding to the state of each page of the logical partition may be used to track the state while the logical partition continues to run during a migration.
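A page manager table of this kind can be pictured as one state entry per page, updated while the partition keeps running. The sketch below is illustrative only: the three state names, the `PageManagerTable` class, and the rule that a write after transfer marks the page for resending are assumptions made for the example, not details from the abstract.

```python
from enum import Enum


class PageState(Enum):
    NOT_SENT = 1   # page has not been transferred yet
    SENT = 2       # page was transferred and is unchanged since
    DIRTY = 3      # page was modified after transfer and must be resent


class PageManagerTable:
    """Hypothetical table with one state entry per page of the partition."""

    def __init__(self, num_pages):
        self.state = [PageState.NOT_SENT] * num_pages

    def mark_written(self, page):
        # The partition keeps running, so a sent page can change again.
        if self.state[page] is PageState.SENT:
            self.state[page] = PageState.DIRTY

    def mark_sent(self, page):
        self.state[page] = PageState.SENT

    def pages_ready_to_transfer(self):
        return [p for p, s in enumerate(self.state) if s is not PageState.SENT]

    def migration_complete(self):
        return all(s is PageState.SENT for s in self.state)


table = PageManagerTable(num_pages=3)
for page in table.pages_ready_to_transfer():
    table.mark_sent(page)
table.mark_written(1)                      # page 1 changes during migration
resend = table.pages_ready_to_transfer()
```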