Managing variations among nodes in parallel system frameworks

    Publication number: US10355966B2

    Publication date: 2019-07-16

    Application number: US15081558

    Filing date: 2016-03-25

    Abstract: Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.
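
The mapping step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the variability metric is computed, so the coefficient of variation used here is an assumption, and the names `variability_metric` and `map_tasks` are hypothetical.

```python
from statistics import mean, pstdev


def variability_metric(samples):
    """Assumed metric: coefficient of variation of one node's measurements."""
    avg = mean(samples)
    return pstdev(samples) / avg if avg else float("inf")


def map_tasks(tasks, node_samples):
    """Assign critical tasks to the nodes with the lowest variability.

    tasks: list of (task_name, is_critical) pairs.
    node_samples: dict mapping node name to a list of measurements.
    Returns a dict mapping task name to node name.
    """
    metrics = {node: variability_metric(s) for node, s in node_samples.items()}
    nodes = sorted(metrics, key=metrics.get)          # lowest variability first
    ordered = sorted(tasks, key=lambda t: not t[1])   # critical tasks first
    # Wrap around if there are more tasks than nodes.
    return {name: nodes[i % len(nodes)] for i, (name, _) in enumerate(ordered)}


if __name__ == "__main__":
    samples = {
        "n0": [100, 100, 100],
        "n1": [90, 110, 100],
        "n2": [50, 150, 100],
    }
    tasks = [("io", False), ("solver", True), ("log", False)]
    print(map_tasks(tasks, samples))
```

In this example the critical task `solver` lands on `n0`, the node whose measurements vary least.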

    Setting Operating Points for Circuits in an Integrated Circuit Chip

    Publication number: US20190123648A1

    Publication date: 2019-04-25

    Application number: US16130136

    Filing date: 2018-09-13

    Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
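
One way to picture "combined power efficiency" is a brute-force search over candidate operating points. This is only a sketch under stated assumptions, not the patented method: LDO efficiency is approximated as output voltage divided by input voltage, the shared switching-regulator output is taken to be the highest chosen local voltage plus a fixed dropout, and efficiency is scored as total frequency per input watt. The function name and candidate values are hypothetical.

```python
from itertools import product


def pick_operating_points(candidates, dropout_v=0.05):
    """Choose one operating point per circuit subset.

    candidates: one list per subset, each entry (freq_ghz, volt, load_power_w).
    Returns (chosen_points, efficiency), where efficiency is the summed
    frequency divided by the power drawn from the switching regulator.
    """
    best, best_eff = None, -1.0
    for combo in product(*candidates):
        # All LDOs share one input voltage, so it must cover the highest output.
        vin = max(volt for _, volt, _ in combo) + dropout_v
        # Assumed LDO model: input power = load power * (vin / vout).
        input_power = sum(power * vin / volt for _, volt, power in combo)
        eff = sum(freq for freq, _, _ in combo) / input_power
        if eff > best_eff:
            best, best_eff = combo, eff
    return best, best_eff


if __name__ == "__main__":
    points = [(1.0, 0.7, 1.0), (2.0, 1.0, 4.0)]
    chosen, eff = pick_operating_points([points, points])
    print(chosen, round(eff, 3))
```

The sketch shows why the subsets have to be evaluated together: raising one subset's voltage raises the shared input voltage, which lowers the LDO efficiency of every other subset.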

    Balancing computation and communication power in power constrained clusters

    Publication number: US09983652B2

    Publication date: 2018-05-29

    Application number: US14959669

    Filing date: 2015-12-04

    CPC classification number: G06F1/3203 G06F1/3206 G06F1/3287 Y02D10/171

    Abstract: Systems, apparatuses, and methods for balancing computation and communication power in power constrained environments. A data processing cluster with a plurality of compute nodes may perform parallel processing of a workload in a power constrained environment. Nodes that finish tasks early may be power-gated based on one or more conditions. In some scenarios, a node may predict a wait duration and go into a reduced power consumption state if the wait duration is predicted to be greater than a threshold. The power saved by power-gating one or more nodes may be reassigned for use by other nodes. A cluster agent may be configured to reassign the unused power to the active nodes to expedite workload processing.
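
The power reassignment described here can be sketched as follows. The data layout, the even split among busy nodes, and the residual power drawn by a gated node are all assumptions made for illustration; `rebalance_power` is a hypothetical name.

```python
def rebalance_power(nodes, threshold_s, gated_power_w=5.0):
    """Power-gate long-waiting nodes and hand their budget to busy nodes.

    nodes: dict mapping node name to a dict with
        "budget_w": current power budget in watts
        "predicted_wait_s": predicted wait in seconds, or None if still busy
    Returns a dict mapping node name to its new power budget.
    """
    busy = [n for n, s in nodes.items() if s["predicted_wait_s"] is None]
    gated = [
        n for n, s in nodes.items()
        if s["predicted_wait_s"] is not None and s["predicted_wait_s"] > threshold_s
    ]
    new_budget = {n: s["budget_w"] for n, s in nodes.items()}
    if not busy:
        return new_budget  # nobody can use the saved power, so gate nothing
    saved = 0.0
    for n in gated:
        saved += nodes[n]["budget_w"] - gated_power_w
        new_budget[n] = gated_power_w
    share = saved / len(busy)  # assumed policy: split evenly among busy nodes
    for n in busy:
        new_budget[n] += share
    return new_budget


if __name__ == "__main__":
    cluster = {
        "a": {"budget_w": 100.0, "predicted_wait_s": None},
        "b": {"budget_w": 100.0, "predicted_wait_s": None},
        "c": {"budget_w": 100.0, "predicted_wait_s": 10.0},
        "d": {"budget_w": 100.0, "predicted_wait_s": 0.5},
    }
    print(rebalance_power(cluster, threshold_s=2.0))
```

Node `d` finishes early but its predicted wait is below the threshold, so it keeps its budget; only `c` is gated, and the total cluster budget is unchanged.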

    ACHIEVING BALANCED EXECUTION THROUGH RUNTIME DETECTION OF PERFORMANCE VARIATION

    Publication number: US20170373955A1

    Publication date: 2017-12-28

    Application number: US15192764

    Filing date: 2016-06-24

    CPC classification number: G06F11/30 G06F9/4893 G06F2209/5019 Y02D10/24

    Abstract: Systems, apparatuses, and methods for achieving balanced execution in a multi-node cluster through runtime detection of performance variation are described. During a training phase, performance counters and the amount of time spent waiting for synchronization are monitored for a plurality of tasks for each node of the multi-node cluster. These values are utilized to generate a model which correlates the values of the performance counters to the amount of time spent waiting for synchronization. Once the model is built, the values of the performance counters are monitored for a period of time at the start of each task, and these values are input into the model. The model generates a prediction of whether a given node is on the critical path. If the given node is predicted to be on the critical path, the power allocation of the given node is increased.
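
The train-then-predict flow can be sketched with a one-variable least-squares fit. The abstract does not specify the model type, so the linear model, the single counter, the rule that a short predicted wait means the node is on the critical path, and the fixed power boost are all assumptions; the function names are hypothetical.

```python
def fit_line(counter_values, wait_times):
    """Training phase: least-squares fit of wait time against one counter."""
    n = len(counter_values)
    mean_x = sum(counter_values) / n
    mean_y = sum(wait_times) / n
    sxx = sum((x - mean_x) ** 2 for x in counter_values)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(counter_values, wait_times)
    )
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def adjust_power(model, early_counter, power_w, wait_threshold_s=0.1, boost_w=10.0):
    """Runtime phase: predict the wait from an early counter reading.

    Assumption: a node that is predicted to wait very little is the one the
    others wait for, so it is treated as being on the critical path.
    """
    slope, intercept = model
    predicted_wait = slope * early_counter + intercept
    on_critical_path = predicted_wait < wait_threshold_s
    return power_w + boost_w if on_critical_path else power_w


if __name__ == "__main__":
    # Hypothetical training data: counter reading vs. seconds spent waiting.
    model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0])
    print(adjust_power(model, early_counter=4.0, power_w=100.0))
    print(adjust_power(model, early_counter=1.0, power_w=100.0))
```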

    Power management of instruction processors in a system-on-a-chip

    Publication number: US10133574B2

    Publication date: 2018-11-20

    Application number: US15181837

    Filing date: 2016-06-14

    Abstract: A system-on-a-chip includes a plurality of instruction processors and a hardware block such as a system management unit. The hardware block accesses values of performance counters associated with the plurality of instruction processors and modifies one or more operating points of one or more of the plurality of instruction processors based on comparisons of the instruction arrival rates and the instruction service rates to achieve optimized system metrics.
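
The comparison of arrival and service rates can be sketched as a simple step-up, step-down policy. The abstract does not describe the actual decision rule, so the policy, the `headroom` parameter, and the name `next_operating_point` are assumptions.

```python
def next_operating_point(num_levels, current, arrival_rate, service_rate, headroom=1.2):
    """Return the index of the next operating point for one processor.

    num_levels: number of available operating points, ordered slow to fast.
    current: index of the operating point in use.
    arrival_rate, service_rate: instructions per second, from performance counters.
    """
    if arrival_rate > service_rate and current < num_levels - 1:
        return current + 1  # work is piling up: move to a faster point
    if service_rate > arrival_rate * headroom and current > 0:
        return current - 1  # ample slack: move to a slower, cheaper point
    return current


if __name__ == "__main__":
    print(next_operating_point(4, 1, arrival_rate=2.0e9, service_rate=1.5e9))
    print(next_operating_point(4, 1, arrival_rate=1.0e9, service_rate=1.5e9))
    print(next_operating_point(4, 1, arrival_rate=1.4e9, service_rate=1.5e9))
```

The `headroom` margin keeps the processor from oscillating between two points when the rates are nearly equal.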

    System and method for determining concurrency factors for dispatch size of parallel processor kernels

    Publication number: US09965343B2

    Publication date: 2018-05-08

    Application number: US14710879

    Filing date: 2015-05-13

    CPC classification number: G06F9/545 G06F9/44505 Y02D10/43

    Abstract: Disclosed is a method of determining concurrency factors for an application running on a parallel processor. Also disclosed is a system for implementing the method. In an embodiment, the method includes running at least a portion of the kernel as sequences of mini-kernels, each mini-kernel including a number of concurrently executing workgroups. The number of concurrently executing workgroups is defined as a concurrency factor of the mini-kernel. A performance measure is determined for each sequence of mini-kernels. From the sequences, a particular sequence is chosen that achieves a desired performance of the kernel, based on the performance measures. The kernel is executed with the particular sequence.
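
The selection procedure can be sketched by splitting a kernel's workgroups into mini-kernels and scoring each candidate sequence. For simplicity this sketch assumes every mini-kernel in a sequence uses the same concurrency factor and that lower total cost is the desired performance; the `measure` callback stands in for a real timing run, and all names are hypothetical.

```python
def split_into_mini_kernels(total_workgroups, concurrency_factor):
    """Return (first_workgroup, count) pairs covering all workgroups."""
    return [
        (start, min(concurrency_factor, total_workgroups - start))
        for start in range(0, total_workgroups, concurrency_factor)
    ]


def choose_concurrency_factor(total_workgroups, factors, measure):
    """Pick the factor whose mini-kernel sequence has the lowest total cost.

    measure(first_workgroup, count) returns the cost of one mini-kernel,
    for example its measured run time.
    Returns (factor, total_cost, sequence).
    """
    best = None
    for factor in factors:
        sequence = split_into_mini_kernels(total_workgroups, factor)
        cost = sum(measure(start, count) for start, count in sequence)
        if best is None or cost < best[1]:
            best = (factor, cost, sequence)
    return best


if __name__ == "__main__":
    # Hypothetical cost model: fixed launch overhead plus contention that
    # grows with the number of concurrently executing workgroups.
    def fake_measure(start, count):
        return 1.0 + 0.1 * count ** 2

    factor, cost, sequence = choose_concurrency_factor(8, [1, 2, 4, 8], fake_measure)
    print(factor, sequence)
```

With this made-up cost model, a factor of 4 balances launch overhead against contention; a real run would replace `fake_measure` with timings from the device.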
