Abstract:
A method and apparatus for managing power in a thermal couple aware system includes determining a candidate configuration mapping based upon one or more criteria, the candidate configuration mapping being a mapping of performance for a candidate configuration of processor sockets in the thermal couple aware system. The candidate configuration mapping is evaluated by comparing the candidate configuration mapping to a stored configuration. If the evaluated candidate configuration mapping provides a better metric than the stored configuration, the stored configuration is updated with the evaluated candidate configuration mapping, and programming instructions are executed in accordance with the candidate configuration mapping if no other configuration mappings are to be determined.
Abstract:
A processing unit monitors, over a sliding time window, the number of instructions dispatched for execution at the processing unit. In response to the number of instructions exceeding a threshold, the processing unit throttles (e.g., pauses) the dispatch of instructions for a specified amount of time. The processing unit thus ensures that the number of dispatched instructions does not change too rapidly in too short a period of time reducing the likelihood of sudden changes in processing unit current, without unduly impacting the performance of the processing unit across all workloads.
Abstract:
A method and apparatus for managing power in a thermal couple aware system includes determining a candidate configuration mapping based upon one or more criteria, the candidate configuration mapping being a mapping of performance for a candidate configuration of processor sockets in the thermal couple aware system. The candidate configuration mapping is evaluated by comparing the candidate configuration mapping to a stored configuration. If the evaluated candidate configuration mapping provides a better metric than the stored configuration, the stored configuration is updated with the evaluated candidate configuration mapping, and programming instructions are executed in accordance with the candidate configuration mapping if no other configuration mappings are to be determined.
Abstract:
An apparatus includes an integrated circuit chip with a set of circuits having two or more subsets of circuits; an external voltage regulator separate from the integrated circuit chip; two or more integrated voltage regulators on the integrated circuit chip that each provide an input voltage to a respective subset of the circuits; and a controller. The controller determines, using an integrated voltage regulator power loss model, an electrical power loss for the integrated voltage regulators for a first combination of operating points for the subsets of the circuits. The controller then determines, based on the electrical power loss, a second combination of operating points for the subsets of the circuits that includes an adjustment to an operating point for at least one of the subsets of the circuits that compensates for an electrical power loss of the corresponding integrated voltage regulator. The controller sets an operating point of each of the subsets of the circuits based on the second combination of operating points.
Abstract:
A processing system includes one or more processing units to perform operations and one or more sensors to measure a temperature concurrently with the one or more processing units performing the operations. The processing system also includes a controller to receive feedback indicating the temperature and to determine a peak temperature and a thermal time constant for heating of the processing system based on a comparison of the measured temperature to a first temperature that is predicted based on the peak temperature and a previously determined thermal time constant for heating. Some embodiments of the controller can control a performance state of the processing system based on the peak temperature and the thermal time constant for heating of the processing system.
Abstract:
The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
Abstract:
Systems, apparatuses, and methods for performing temperature-aware task scheduling and proactive power management. A SoC includes a plurality of processing units and a task queue storing pending tasks. The SoC calculates a thermal metric for each pending task to predict an amount of heat the pending task will generate. The SoC also determines a thermal gradient for each processing unit to predict a rate at which the processing unit's temperature will change when executing a task. The SoC also monitors a thermal margin of how far each processing unit is from reaching its thermal limit. The SoC minimizes non-uniform heat generation on the SoC by scheduling pending tasks from the task queue to the processing units based on the thermal metrics for the pending tasks, the thermal gradients of each processing unit, and the thermal margin available on each processing unit.
Abstract:
Systems, apparatuses, and methods for balancing computation and communication power in power constrained environments. A data processing cluster with a plurality of compute nodes may perform parallel processing of a workload in a power constrained environment. Nodes that finish tasks early may be power-gated based on one or more conditions. In some scenarios, a node may predict a wait duration and go into a reduced power consumption state if the wait duration is predicted to be greater than a threshold. The power saved by power-gating one or more nodes may be reassigned for use by other nodes. A cluster agent may be configured to reassign the unused power to the active nodes to expedite workload processing.
Abstract:
Disclosed is a method of determining concurrency factors for an application running on a parallel processor. Also disclosed is a system for implementing the method. In an embodiment, the method includes running at least a portion of the kernel as sequences of mini-kernels, each mini-kernel including a number of concurrently executing workgroups. The number of concurrently executing workgroups is defined as a concurrency factor of the mini-kernel. A performance measure is determined for each sequence of mini-kernels. From the sequences, a particular sequence is chosen that achieves a desired performance of the kernel, based on the performance measures. The kernel is executed with the particular sequence.
Abstract:
A method includes adjusting a maximum skin temperature threshold of a device based on a device state, adjusting a power limit for the device based on the adjusted maximum skin temperature threshold, and operating the device based on the adjusted power limit. A processor includes a processing unit and a power management controller to adjust a maximum skin temperature threshold based on a device state and adjust a power limit for the processing unit based on the adjusted maximum skin temperature threshold.