摘要:
A method, system, and data processing system for dynamic detection of problem components in a hot-plug processing system and automatic removal of the problem component via hot-removal methods without disrupting processing of the overall system. A data processing system that provides a non-disruptive, hot-plug functionality is designed with a additional logic for initiating and/or completing a sequence of factory level tests on hot-pluggable components to determine if the component if functioning properly. When a component is not functioning properly, the OS re-allocates the workload of the component to other component so the system, and when the OS completes the re-allocation, the service element initiates the hot removal of the component so that the component is logically and electrically separated from the system.
摘要:
A data processing system that provides hot-plug add and remove functionality for individual, hot-pluggable components without disrupting current operations of the overall processing system. The processing system includes an interconnect fabric that includes hot plug connector at which an external hot-pluggable component can be coupled to the data processing system and logic components include configuration logic and routing and operating logic. When a hot-pluggable component is connected to the hot plug connector, the service element automatically detects the connection and selects the correct configuration file for the extended system. Once the configuration file is loaded and the system checks of the new element indicates the new element is ready for integration, the new element is integrated into the existing system, and the OS allocates workload to the new element. From a customer perspective, the entire process thus occurs without powering down or disrupting the operation of the existing element.
摘要:
A data processing system includes a mechanism to periodically idle the normal system operation to allow recalibration of its interface circuitry by transmission of data with transitions and logic levels indicative of actual operation. Provision is made to protect actual data of the system from corruption during recalibration.
摘要:
A technique for operating a high performance computing cluster (HPC) having multiple nodes (each of which include multiple processors) includes periodically broadcasting information, related to processor utilization and network utilization at each of the multiple nodes, from each of the multiple nodes to remaining ones of the multiple nodes. Respective local job tables maintained in each of the multiple nodes are updated based on the broadcast information. One or more threads are then moved from one or more of the multiple processors to a different one of the multiple processors (based on the broadcast information in the respective local job tables).
摘要:
Disclosed are a method, a system and a computer program product for operating a cache system. The cache system can include multiple cache lines, and a first cache line of the multiple of cache lines can include multiple cache cells, and a bus coupled to the multiple cache cells. In one or more embodiments, the bus can include a switch that is operable to receive a first control signal and to split the bus into first and second portions or aggregate the bus into a whole based on the first control signal. When the bus is split, a first cache cell and a second cache cell of the multiple cache cells are coupled to respective first and second portions of the bus. Data from the first and second cache cells can be selected through respective portions of the bus and outputted through a port of the cache system.
摘要:
A processor communication register (PCR) contained within a multiprocessor cluster system provides enhanced processor communication. The PCR stores information that is useful in pipelined or parallel multi-processing. Each processor cluster has exclusive rights to store to a sector within the PCR and has continuous access to read its contents. Each processor cluster updates its exclusive sector within the PCR, instantly allowing all of the other processors within the cluster network to see the change within the PCR data, and bypassing the cache subsystem. Efficiency is enhanced within the processor cluster network by providing processor communications to be immediately networked and transferred into all processors without momentarily restricting access to the information or forcing all the processors to be continually contending for the same cache line, and thereby overwhelming the interconnect and memory system with an endless stream of load, store and invalidate commands.
摘要:
A technique for operating a high performance computing (HPC) cluster includes monitoring workloads of multiple processors included in the HPC cluster. The HPC cluster includes multiple nodes that each include two or more of the multiple processors. One or more threads assigned to one or more of the multiple processors are moved to a different one of the multiple processors based on the workloads of the multiple processors.
摘要:
A symmetric multiprocessor data processing system (SMP) that implements a TLBI protocol, which enables multiple TLBI operations from multiple processors to complete without causing delay. Each processor includes a TLBI register associated with the TLB and TLBI logic. The TLBI register includes a sequence of bits utilized to track the completion of a TLBI issued by the processor at the other processors. Each bit corresponds to a particular processor across the system and the particular processor is able to directly set the bit in the register of a master processor once the particular processor completes a TLBI operation initiated from the master processor. The master processor is able to track completion of the TLBI operation by checking the values of each bit within its TLBI register, without requiring multi-issuance of an address-only barrier operation on the system bus.
摘要:
An Ethernet adapter is disclosed. The Ethernet adapter comprises a plurality of layers for allowing the adapter to receive and transmit packets from and to a processor. The plurality of layers include a demultiplexing mechanism to allow for partitioning of the processor. A Host Ethernet Adapter (HEA) is an integrated Ethernet adapter providing a new approach to Ethernet and TCP acceleration. A set of TCP/IP acceleration features have been introduced in a toolkit approach: Servers TCP/IP stacks use these accelerators when and as required. The interface between the server and the network interface controller has been streamlined by bypassing the PCI bus. The HEA supports network virtualization. The HEA can be shared by multiple OSs providing the essential isolation and protection without affecting its performance.
摘要:
According to a method of operating a data processing system, one or more monitoring parameter sets are established in a processing unit within the data processing system. The processing unit monitors, in hardware, execution of each of a plurality of schedulable software entities within the processing unit in accordance with a monitoring parameter set among the one or more monitoring parameter sets. The processing unit then reports to software executing in the data processing system utilization of hardware resources by each of the plurality of schedulable software entities. The hardware utilization information reported by the processing unit may be stored and utilized by software to schedulable execution of the schedulable software entities reported by the processing unit. The hardware utilization information may also be utilized to generate a classification of at least one executing schedulable software entity, which may be communicated to the processing unit to dynamically modify an allocation of hardware resources to the schedulable software entity.