Abstract:
Proactive power management in a parallel computer, the parallel computer including a service node and a plurality of compute nodes, the service node connected to the compute nodes through an out-of-band service network, each compute node including a computer processor and a computer memory operatively coupled to the computer processor. Embodiments include receiving, by the service node, a user instruction to initiate a job on an operational group of compute nodes in the parallel computer, the instruction including power management attributes for the compute nodes; setting, by the service node in accordance with the power management attributes for the compute nodes of the operational group, power consumption ratios for each compute node of the operational group including a computer processor power consumption ratio and a computer memory power consumption ratio; and initiating, by the service node, the job on the compute nodes of the operational group of the parallel computer.
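The setting step can be sketched as follows. This is a minimal illustration, not the claimed implementation: the attribute keys `processor_ratio` and `memory_ratio`, the `PowerRatios` type, and the reading of each ratio as a fraction between 0 and 1 are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerRatios:
    """Hypothetical per-node ratios; the 0..1 interpretation is an assumption."""
    processor: float
    memory: float


def set_power_ratios(operational_group, attributes):
    """Service-node step: derive ratios for every node of the job's group.

    `attributes` stands in for the power management attributes carried by
    the user instruction; its keys are hypothetical.
    """
    cpu = attributes["processor_ratio"]
    mem = attributes["memory_ratio"]
    for name, value in (("processor_ratio", cpu), ("memory_ratio", mem)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # Every compute node in the operational group receives the same ratios.
    return {node: PowerRatios(cpu, mem) for node in operational_group}


ratios = set_power_ratios(["node0", "node1"],
                          {"processor_ratio": 0.6, "memory_ratio": 0.4})
```

In this sketch the ratios are fixed before the job starts, which mirrors the order in the abstract: receive the instruction, set the ratios, then initiate the job.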
Abstract:
Managing power in a parallel computer, the parallel computer including a power supply and a plurality of compute nodes, the plurality of compute nodes powered by the power supply through a plurality of DC-DC converters, each DC-DC converter supplying current to an assigned group of compute nodes, each DC-DC converter having a current sensor. Embodiments include monitoring, by the current sensor, an amount of current supplied by that DC-DC converter to its assigned group of compute nodes; determining, by at least one DC-DC converter, that the amount of current supplied is greater than a predefined threshold value; sending, by the at least one DC-DC converter to the plurality of compute nodes, a global interrupt, including notifying the plurality of compute nodes to reduce power consumption; and reducing, by the plurality of compute nodes in accordance with power consumption ratios, power consumption of the compute nodes.
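The monitor, compare, interrupt, reduce sequence can be modeled in a few lines. The sketch below is a software simulation under stated assumptions: the function names are hypothetical, the global interrupt is reduced to a boolean, and each node's power consumption ratio is assumed to act as a simple multiplier on its current power draw.

```python
def converters_over_threshold(currents_amps, threshold_amps):
    """Return indices of DC-DC converters whose sensed current exceeds the threshold."""
    return [i for i, amps in enumerate(currents_amps) if amps > threshold_amps]


def monitor_step(currents_amps, threshold_amps, node_power_watts, ratios):
    """One monitoring cycle.

    If at least one converter is over the threshold, a global interrupt is
    modeled: every node (not only those behind the offending converter)
    scales its power by its own ratio. Treating the ratio as a multiplier
    is an assumption made for this sketch.
    """
    interrupt = bool(converters_over_threshold(currents_amps, threshold_amps))
    if not interrupt:
        return interrupt, dict(node_power_watts)
    reduced = {node: watts * ratios[node]
               for node, watts in node_power_watts.items()}
    return interrupt, reduced


interrupt, power = monitor_step(
    currents_amps=[10.0, 14.0],          # one reading per converter
    threshold_amps=12.0,
    node_power_watts={"n0": 100.0, "n1": 80.0},
    ratios={"n0": 0.5, "n1": 0.5},
)
```

Here the second converter reads 14.0 A against a 12.0 A threshold, so the interrupt fires and both nodes halve their power.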
Abstract:
A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and perform a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse-width-modified clock signal to a plurality of processors in the parallel computing system.
Abstract:
An apparatus, method and computer program product for automatically controlling power dissipation of a parallel computing system that includes a plurality of processors. A computing device issues a command to the parallel computing system. A clock pulse-width modulator encodes the command in a system clock signal to be distributed to the plurality of processors. The plurality of processors in the parallel computing system receive the system clock signal including the encoded command, and adjust power dissipation according to the encoded command.
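One way to picture encoding a command in the clock is to vary the duty cycle of successive clock periods while leaving the period itself unchanged. The sketch below is purely illustrative: the duty-cycle values, the one-bit-per-period scheme, and the most-significant-bit-first order are assumptions, not details taken from the abstract.

```python
NOMINAL_DUTY = 0.50  # assumed duty cycle of an unmodified period (encodes 0)
WIDE_DUTY = 0.75     # assumed duty cycle of a widened pulse (encodes 1)


def encode_command(command, n_bits):
    """Modulator side: map a command to one duty cycle per clock period, MSB first."""
    if not 0 <= command < (1 << n_bits):
        raise ValueError("command does not fit in n_bits")
    bits = [(command >> shift) & 1 for shift in reversed(range(n_bits))]
    return [WIDE_DUTY if bit else NOMINAL_DUTY for bit in bits]


def decode_command(duty_cycles):
    """Processor side: recover the command by thresholding each measured duty cycle."""
    midpoint = (NOMINAL_DUTY + WIDE_DUTY) / 2
    value = 0
    for duty in duty_cycles:
        value = (value << 1) | (1 if duty > midpoint else 0)
    return value


waveform = encode_command(0b101, n_bits=3)
recovered = decode_command(waveform)
```

Because only the pulse width changes, the rising edges stay evenly spaced in this model, so the same signal can still serve as the system clock while carrying the command.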
Abstract:
Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system and a scan of the system state.
Abstract:
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources. This enables adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.
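To make the five-dimensional torus concrete, the sketch below computes a node's direct neighbors and the minimal hop count between two nodes. The dimension sizes used in the example are assumptions chosen for illustration; the abstract does not state them, and the function names are hypothetical.

```python
def torus_neighbors(coord, dims):
    """Return the nodes one hop away from `coord`, wrapping around in every dimension."""
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[axis] = (nxt[axis] + step) % size  # wraparound link of the torus
            neighbors.append(tuple(nxt))
    return neighbors


def torus_distance(a, b, dims):
    """Minimal hop count: per dimension, take the shorter way around the ring."""
    hops = 0
    for x, y, size in zip(a, b, dims):
        delta = abs(x - y)
        hops += min(delta, size - delta)
    return hops


dims = (4, 4, 4, 4, 4)  # assumed sizes, for illustration only
origin = (0, 0, 0, 0, 0)
links = torus_neighbors(origin, dims)
hops = torus_distance(origin, (3, 0, 0, 0, 2), dims)
```

With every dimension of size 3 or more, each node has ten distinct neighbors (two per dimension), and the wraparound links keep the worst-case hop count at half the ring length per dimension.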