Method and apparatus for managing power in a thermal couple aware system

    Publication No.: US10955884B2

    Publication date: 2021-03-23

    Application No.: US15071643

    Filing date: 2016-03-16

    Abstract: A method and apparatus for managing power in a thermal couple aware system includes determining a candidate configuration mapping based upon one or more criteria, the candidate configuration mapping being a mapping of performance for a candidate configuration of processor sockets in the thermal couple aware system. The candidate configuration mapping is evaluated by comparing the candidate configuration mapping to a stored configuration. If the evaluated candidate configuration mapping provides a better metric than the stored configuration, the stored configuration is updated with the evaluated candidate configuration mapping, and programming instructions are executed in accordance with the candidate configuration mapping if no other configuration mappings are to be determined.
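
The evaluate-and-update loop in this abstract can be sketched as a search that keeps the best candidate seen so far. This is a minimal illustration, not the patented method: the tuple representation of a mapping, the name `select_configuration`, and the toy metric (total performance minus a penalty for adjacent sockets that both run hot) are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical representation: one performance level per processor socket.
Mapping = Tuple[int, ...]


def select_configuration(
    candidates: Iterable[Mapping],
    metric: Callable[[Mapping], float],
) -> Optional[Mapping]:
    """Return the candidate mapping with the best (highest) metric."""
    stored: Optional[Mapping] = None
    stored_score = float("-inf")
    for candidate in candidates:
        score = metric(candidate)
        # Update the stored configuration only when the candidate is better.
        if score > stored_score:
            stored, stored_score = candidate, score
    # No candidates remain: this is the mapping to execute with.
    return stored


def toy_metric(mapping: Mapping) -> float:
    """Assumed metric: performance minus a thermal-coupling penalty."""
    performance = sum(mapping)
    # Neighbouring sockets heat each other; penalise pairs that are both busy.
    coupling = sum(min(a, b) for a, b in zip(mapping, mapping[1:]))
    return performance - 1.5 * coupling


if __name__ == "__main__":
    print(select_configuration([(3, 3), (3, 0), (2, 2)], toy_metric))
```

With this toy metric, running one socket fast and idling its neighbour scores higher than running both at full speed, which is the kind of trade-off a thermal-coupling-aware metric is meant to expose.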

    BANDWIDTH-AWARE MULTI-FREQUENCY PERFORMANCE ESTIMATION MECHANISM

    Publication No.: US20180210531A1

    Publication date: 2018-07-26

    Application No.: US15416993

    Filing date: 2017-01-26

    Abstract: Systems, apparatuses, and methods for implementing performance estimation mechanisms are disclosed. In one embodiment, a computing system includes at least one processor and a memory subsystem. During a characterization phase, the system utilizes a memory intensive workload to detect when the memory subsystem reaches its saturation point. Then, the system collects performance counter values during a sampling phase of a target application to determine the memory bandwidth. If the memory bandwidth is greater than the saturation point, then the system generates a prediction of the memory time which is based on a ratio of the memory bandwidth over the saturation point. Otherwise, if the memory bandwidth is less than the saturation point, the system assumes memory time is constant versus processor frequency. Then, the system uses the memory time and an estimate of the compute time to estimate a phase time for the target application at different processor frequencies.
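
The estimation step described here can be illustrated with a small function. The abstract only says that the memory-time prediction is "based on a ratio" of bandwidth to saturation point, so the exact formula below (multiplying the sampled memory time by that ratio) is an assumption, as are the inverse scaling of compute time with frequency and every name and unit in the code.

```python
def estimate_phase_time(
    compute_time_s: float,
    memory_time_s: float,
    bandwidth: float,
    saturation_bw: float,
    sample_freq_ghz: float,
    target_freq_ghz: float,
) -> float:
    """Estimate phase time at target_freq_ghz from values sampled at sample_freq_ghz."""
    # Assumption: compute time scales inversely with processor frequency.
    compute = compute_time_s * sample_freq_ghz / target_freq_ghz

    if bandwidth > saturation_bw:
        # Above saturation: scale memory time by the bandwidth/saturation ratio
        # (assumed form; the abstract does not give the exact formula).
        memory = memory_time_s * (bandwidth / saturation_bw)
    else:
        # Below saturation: memory time is treated as constant versus frequency.
        memory = memory_time_s

    return compute + memory


if __name__ == "__main__":
    for freq in (1.0, 2.0, 4.0):
        t = estimate_phase_time(2.0, 1.0, 50.0, 100.0, 2.0, freq)
        print(f"{freq:.1f} GHz -> {t:.2f} s")
```

Below saturation only the compute term changes with frequency, so a memory-bound phase gains little from a higher clock.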

    BALANCING COMPUTATION AND COMMUNICATION POWER IN POWER CONSTRAINED CLUSTERS

    Publication No.: US20170160781A1

    Publication date: 2017-06-08

    Application No.: US14959669

    Filing date: 2015-12-04

    CPC classification number: G06F1/3203 G06F1/3206 G06F1/3287 Y02D10/171

    Abstract: Systems, apparatuses, and methods for balancing computation and communication power in power constrained environments are disclosed. A data processing cluster with a plurality of compute nodes may perform parallel processing of a workload in a power constrained environment. Nodes that finish tasks early may be power-gated based on one or more conditions. In some scenarios, a node may predict a wait duration and go into a reduced power consumption state if the wait duration is predicted to be greater than a threshold. The power saved by power-gating one or more nodes may be reassigned for use by other nodes. A cluster agent may be configured to reassign the unused power to the active nodes to expedite workload processing.
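
A cluster agent of the kind described could be sketched as follows. The `Node` class, the `reassign_power` function, the even split of reclaimed power, and the zero-watt gated state are all hypothetical choices for illustration; the abstract does not say how the saved power is divided among active nodes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    power_w: float           # current power budget in watts
    predicted_wait_s: float  # 0.0 for nodes that are still working
    gated: bool = False


def reassign_power(
    nodes: List[Node],
    wait_threshold_s: float,
    gated_power_w: float = 0.0,
) -> float:
    """Gate nodes with long predicted waits and hand their power to active nodes.

    Returns the number of watts reclaimed from newly gated nodes.
    """
    reclaimed = 0.0
    for node in nodes:
        if not node.gated and node.predicted_wait_s > wait_threshold_s:
            reclaimed += node.power_w - gated_power_w
            node.power_w = gated_power_w
            node.gated = True

    active = [n for n in nodes if not n.gated]
    if active:
        # Assumption: split the reclaimed budget evenly across active nodes.
        share = reclaimed / len(active)
        for node in active:
            node.power_w += share
    return reclaimed


if __name__ == "__main__":
    cluster = [Node("a", 100.0, 0.0), Node("b", 100.0, 5.0), Node("c", 100.0, 0.1)]
    reassign_power(cluster, wait_threshold_s=1.0)
    for n in cluster:
        print(n)
```

The total budget is unchanged as long as at least one node stays active, which matches the power-constrained setting: power moves between nodes and is never added.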

    SYSTEM AND METHOD FOR DETERMINING CONCURRENCY FACTORS FOR DISPATCH SIZE OF PARALLEL PROCESSOR KERNELS
    Document type: Invention application; legal status: in force

    Publication No.: US20160335143A1

    Publication date: 2016-11-17

    Application No.: US14710879

    Filing date: 2015-05-13

    CPC classification number: G06F9/545 G06F9/44505 Y02D10/43

    Abstract: Disclosed is a method of determining concurrency factors for an application running on a parallel processor. Also disclosed is a system for implementing the method. In an embodiment, the method includes running at least a portion of the kernel as sequences of mini-kernels, each mini-kernel including a number of concurrently executing workgroups. The number of concurrently executing workgroups is defined as a concurrency factor of the mini-kernel. A performance measure is determined for each sequence of mini-kernels. From the sequences, a particular sequence is chosen that achieves a desired performance of the kernel, based on the performance measures. The kernel is executed with the particular sequence.
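
The selection procedure in this abstract can be sketched as: build one mini-kernel sequence per candidate concurrency factor, measure each, and keep the best. This is a simplified sketch under assumptions: each sequence here uses a single uniform factor, lower measurements are treated as better, and `toy_cost` is a made-up cost model standing in for real timing of a kernel on a parallel processor.

```python
from typing import Callable, Iterable, List


def split_into_mini_kernels(total_workgroups: int, factor: int) -> List[int]:
    """Split a kernel into mini-kernels of at most `factor` concurrent workgroups."""
    if factor <= 0:
        raise ValueError("concurrency factor must be positive")
    full, rest = divmod(total_workgroups, factor)
    return [factor] * full + ([rest] if rest else [])


def choose_sequence(
    total_workgroups: int,
    factors: Iterable[int],
    measure: Callable[[List[int]], float],
) -> List[int]:
    """Return the mini-kernel sequence with the lowest measured cost."""
    sequences = [split_into_mini_kernels(total_workgroups, f) for f in factors]
    return min(sequences, key=measure)


def toy_cost(sequence: List[int]) -> float:
    """Assumed model: fixed launch overhead plus contention growing with concurrency."""
    return sum(1.0 + 0.05 * n * n for n in sequence)


if __name__ == "__main__":
    print(choose_sequence(16, [2, 4, 8, 16], toy_cost))
```

In the toy model a small factor pays many launch overheads and a large factor pays for contention, so an intermediate factor wins. A real measurement would replace `toy_cost` with timed execution of each sequence.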

    HARDWARE AND RUNTIME COORDINATED LOAD BALANCING FOR PARALLEL APPLICATIONS
    Document type: Invention application; legal status: in force

    Publication No.: US20160259667A1

    Publication date: 2016-09-08

    Application No.: US14641220

    Filing date: 2015-03-06

    Abstract: A method of balancing execution rates for a plurality of parallel program loops being executed concurrently by a processor may include estimating a completion time for each program loop of the plurality of program loops, determining a difference between the estimated completion time of a first program loop of the plurality of program loops and the estimated completion time of a second program loop of the plurality of program loops, and decreasing the difference by adjusting an execution rate of the first program loop.
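
The three steps in this abstract (estimate completion times, take their difference, shrink it by adjusting the first loop's rate) can be written as a short calculation. The linear completion-time model (remaining iterations divided by rate), the `gain` parameter, and the function names are assumptions for illustration. How the rate is actually changed in hardware or by the runtime is not stated in the abstract.

```python
def estimated_completion_s(remaining_iters: float, rate_iters_per_s: float) -> float:
    """Assumed model: time left equals remaining iterations divided by rate."""
    return remaining_iters / rate_iters_per_s


def adjusted_first_rate(
    first_remaining: float,
    first_rate: float,
    second_remaining: float,
    second_rate: float,
    gain: float = 1.0,
) -> float:
    """Return a new rate for the first loop that narrows the completion-time gap.

    gain=1.0 removes the whole difference; smaller values close part of it.
    """
    if not 0.0 < gain <= 1.0:
        raise ValueError("gain must be in (0, 1]")
    t_first = estimated_completion_s(first_remaining, first_rate)
    t_second = estimated_completion_s(second_remaining, second_rate)
    difference = t_first - t_second
    # Move the first loop's completion time toward the second loop's.
    target_time = t_first - gain * difference
    return first_remaining / target_time


if __name__ == "__main__":
    # First loop would finish in 10 s, second in 20 s.
    print(adjusted_first_rate(1000, 100.0, 1000, 50.0))
    print(adjusted_first_rate(1000, 100.0, 1000, 50.0, gain=0.5))
```

In the example the first loop is slowed so that both loops finish together. Whether a runtime would slow the faster loop or speed up the slower one is a design choice the abstract leaves open.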
