摘要:
Methods, systems, and articles of manufacture for replacement of a failing processor of a multi-processor system running at least one operating system are provided. In contrast to the prior art, the replacement may be performed by system firmware without intervention by the operating system (i.e., the replacement may be transparent to the operating system). For some embodiments, the multi-processor system may be logically partitioned and the methods may be utilized to replace one or more shared or dedicated processors assigned to a logical partition, transparent to an operating system running on the partition.
摘要:
A method and apparatus for transparently handling recurring correctable errors and uncorrectable errors in a mirrored memory system prevents costly system shutdowns for correctable memory errors or system failures from uncorrectable memory errors. When a high number of correctable errors are detected for a given memory location, a memory relocation mechanism in the hypervisor moves the data associated with the memory location to an alternate physical memory location transparently to the partition such that the partition has no knowledge that the physical memory actualizing the memory location has been changed. When a correctable error occurs, the memory relocation mechanism uses data from a partner mirrored memory block as a data source for the memory block with the uncorrectable error and then relocates the data to a newly allocated memory block to replace the memory block with the uncorrectable error.
摘要:
A method and apparatus for transparently handling recurring correctable errors to prevent costly system shutdowns for correctable memory errors or system failures from uncorrectable memory errors. When a high number of correctable errors are detected for a given memory location, the hypervisor moves the data associated with the memory location to an alternate physical memory location transparently to the partition such that the partition has no knowledge that the physical memory actualizing the memory location has been changed. Similarly, the hypervisor can move direct memory access (DMA) memory locations using an I/O translation table.
摘要:
A method and apparatus for transparently handling recurring correctable errors and uncorrectable errors in a mirrored memory system prevents costly system shutdowns for correctable memory errors or system failures from uncorrectable memory errors. When a high number of correctable errors are detected for a given memory location, a memory relocation mechanism in the hypervisor moves the data associated with the memory location to an alternate physical memory location transparently to the partition such that the partition has no knowledge that the physical memory actualizing the memory location has been changed. When a correctable error occurs, the memory relocation mechanism uses data from a partner mirrored memory block as a data source for the memory block with the uncorrectable error and then relocates the data to a newly allocated memory block to replace the memory block with the uncorrectable error.
摘要:
A method and apparatus for transparently handling recurring correctable errors to prevent costly system shutdowns for correctable memory errors or system failures from uncorrectable memory errors. When a high number of correctable errors are detected for a given memory location, the hypervisor moves the data associated with the memory location to an alternate physical memory location transparently to the partition such that the partition has no knowledge that the physical memory actualizing the memory location has been changed. Similarly, the hypervisor can move direct memory access (DMA) memory locations using an I/O translation table.
摘要:
Exemplary methods, systems, and products are described for executing an overall quantity of data processing within an overall processing period that include executing repeatedly through a series of iterations a portion of the overall quantity of data processing that can be completed in a set processing period, wherein each iteration includes the set processing period and a variable delay period and calculating the variable delay period for an iteration in dependence upon the set processing period, a portion of the overall quantity of data processing performed during the set processing period of the iteration, the overall quantity of data processing, and the overall processing period.
摘要:
A task synchronization mechanism operates on a global lock that is shared between processors an on local locks that are not shared between processors. The local locks are processor-specific locks. Each processor-specific lock is dedicated to a particular processor in the system. When shared access to a resource is required, a processor updates its processor-specific lock to indicate the processor is sharing the resource. Because each processor-specific lock is dedicated to a particular processor, this eliminates a significant portion of the memory bus traffic associated with all processors reading and updating the same lock. When exclusive access to a resource is required, the requesting processor waits until the count of all processor-specific locks indicate that none of these processors have a lock on the resource. Once no processor has a lock on the resource, exclusive access to the resource may be granted.
摘要:
A journal mechanism for a database allows simultaneous deposits on multiple journal arms. According to a first embodiment, a journaling system maintains the time-order of interdependent deposits on the journal, but does not necessarily maintain the time-order of deposits that are independent of each other, thereby providing multiple simultaneous deposit points on the journal. The first embodiment provides excellent scaling of journal functions as processors are added to a database computer system. According to a second embodiment, a journaling system maintains the time-order of deposits on the journal, but allows a group of deposits known as a “bundle” to span multiple journal arms, thereby providing multiple simultaneous deposit points on the journal. The second embodiment provides good scaling while providing compatibility with known database systems. The present invention thus relieves contention for the journal that exists as the number of processors increases in a database system.
摘要:
An apparatus and method retrieves a database record from an in-memory database of a parallel computer system using a non-unique key. The parallel computer system performs a simultaneous search on each node of the computer system using the non-unique key and then utilizes a global combining network to combine the local results from the searches of each node to efficiently and quickly search the entire database.
摘要:
A soft lock mechanism controls access by multiple processes to a shared resource to make simultaneous access an unlikely event, while not necessarily preventing simultaneous access. Preferably, the soft lock contains a next_free_time field, specifying when the soft lock will next be available, and a lock_duration, specifying a sufficiently long interval for most accesses to the resource to complete. The lock is obtained by comparing the current time to next_free_time. If the current time is later than next_free_time, then the lock is obtained immediately, and next_free_time is updated to the current time plus lock_duration. If the current time is before next_free_time, then next_free_time is incremented by lock_duration, and the requesting process waits until the old next_free_time to obtain the lock. No action is required to release the lock.