Managing variations among nodes in parallel system frameworks

    Publication number: US10355966B2

    Publication date: 2019-07-16

    Application number: US15081558

    Application date: 2016-03-25

    Abstract: Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.
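
The mapping step in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not define the variability metric, so the coefficient of variation used here, along with the names `variability_metric` and `map_tasks` and the numeric criticality scores, are assumptions made for the example.

```python
from statistics import mean, pstdev


def variability_metric(samples):
    """Assumed metric: coefficient of variation of a node's measurements."""
    avg = mean(samples)
    return pstdev(samples) / avg if avg else float("inf")


def map_tasks(node_samples, tasks):
    """Assign the most critical tasks to the least variable nodes.

    node_samples: {node_id: [measurement, ...]} (hypothetical sensor/perf data)
    tasks: [(task_name, criticality)], higher criticality = more critical
    """
    # Nodes ordered from lowest to highest variability.
    nodes = sorted(node_samples, key=lambda n: variability_metric(node_samples[n]))
    # Tasks ordered from most to least critical.
    ordered = sorted(tasks, key=lambda t: t[1], reverse=True)
    # Wrap around the node list if there are more tasks than nodes.
    return {name: nodes[i % len(nodes)] for i, (name, _) in enumerate(ordered)}


samples = {"n0": [10, 10, 10], "n1": [5, 15, 10], "n2": [9, 11, 10]}
mapping = map_tasks(samples, [("io", 1), ("solver", 3), ("logging", 0)])
```

Here the most critical task (`solver`) lands on `n0`, the node whose measurements do not vary at all.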

    Balancing computation and communication power in power constrained clusters

    Publication number: US09983652B2

    Publication date: 2018-05-29

    Application number: US14959669

    Application date: 2015-12-04

    CPC classification number: G06F1/3203 G06F1/3206 G06F1/3287 Y02D10/171

    Abstract: Systems, apparatuses, and methods for balancing computation and communication power in power constrained environments. A data processing cluster with a plurality of compute nodes may perform parallel processing of a workload in a power constrained environment. Nodes that finish tasks early may be power-gated based on one or more conditions. In some scenarios, a node may predict a wait duration and go into a reduced power consumption state if the wait duration is predicted to be greater than a threshold. The power saved by power-gating one or more nodes may be reassigned for use by other nodes. A cluster agent may be configured to reassign the unused power to the active nodes to expedite workload processing.
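
One way to picture the power reassignment described here is the sketch below. It assumes a simple equal split of the freed power among nodes that are still computing; the function name `rebalance_power`, the dictionary layout, and the `gated_power_w` parameter are hypothetical and not taken from the patent.

```python
def rebalance_power(nodes, wait_threshold_s, gated_power_w=0.0):
    """Gate long-waiting nodes and hand their power to busy nodes.

    nodes: {node_id: {"budget_w": float, "predicted_wait_s": float or None}}
           predicted_wait_s is None for nodes that are still computing.
    Returns a new {node_id: budget_w} mapping.
    """
    budgets = {n: s["budget_w"] for n, s in nodes.items()}
    busy = [n for n, s in nodes.items() if s["predicted_wait_s"] is None]
    gated = [
        n for n, s in nodes.items()
        if s["predicted_wait_s"] is not None
        and s["predicted_wait_s"] > wait_threshold_s
    ]
    if not busy:
        return budgets  # no node can use the freed power, so change nothing

    freed = 0.0
    for n in gated:
        freed += budgets[n] - gated_power_w
        budgets[n] = gated_power_w

    # Assumption: the freed power is split equally among busy nodes.
    share = freed / len(busy)
    for n in busy:
        budgets[n] += share
    return budgets


cluster = {
    "a": {"budget_w": 100.0, "predicted_wait_s": 5.0},   # long wait: gated
    "b": {"budget_w": 100.0, "predicted_wait_s": None},  # still computing
    "c": {"budget_w": 100.0, "predicted_wait_s": None},  # still computing
    "d": {"budget_w": 100.0, "predicted_wait_s": 0.5},   # short wait: left alone
}
new_budgets = rebalance_power(cluster, wait_threshold_s=1.0, gated_power_w=10.0)
```

The total cluster budget is unchanged; only its distribution shifts toward the nodes that are still working.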

    ACHIEVING BALANCED EXECUTION THROUGH RUNTIME DETECTION OF PERFORMANCE VARIATION

    Publication number: US20170373955A1

    Publication date: 2017-12-28

    Application number: US15192764

    Application date: 2016-06-24

    CPC classification number: G06F11/30 G06F9/4893 G06F2209/5019 Y02D10/24

    Abstract: Systems, apparatuses, and methods for achieving balanced execution in a multi-node cluster through runtime detection of performance variation are described. During a training phase, performance counters and the amount of time spent waiting for synchronization are monitored for a plurality of tasks for each node of the multi-node cluster. These values are utilized to generate a model which correlates the values of the performance counters to the amount of time spent waiting for synchronization. Once the model is built, the values of the performance counters are monitored for a period of time at the start of each task, and these values are input into the model. The model generates a prediction of whether a given node is on the critical path. If the given node is predicted to be on the critical path, the power allocation of the given node is increased.
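
The train-then-predict flow can be illustrated with a deliberately small model. The abstract does not say what kind of model is built, so this sketch assumes a one-variable least-squares line from a single counter value to wait time, and it assumes that a node with a short predicted wait is the one on the critical path. All names (`fit_line`, `is_critical`, `next_power_allocation`) and the sample numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept (training phase)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def is_critical(model, counter_value, wait_threshold_s):
    """Assumption: a short predicted wait means the node is on the critical path."""
    slope, intercept = model
    return slope * counter_value + intercept < wait_threshold_s


def next_power_allocation(power_w, critical, step_w=10.0, cap_w=120.0):
    """Raise the allocation of a predicted-critical node, up to a cap."""
    return min(cap_w, power_w + step_w) if critical else power_w


# Training data: counter value per task vs. observed synchronization wait (s).
model = fit_line([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])

# Runtime: counters sampled at the start of a task feed the model.
critical = is_critical(model, counter_value=4.5, wait_threshold_s=1.5)
power = next_power_allocation(100.0, critical)
```

A real deployment would likely use several counters and a richer model; the single-feature line only keeps the example easy to follow.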

    Bandwidth-aware multi-frequency performance estimation mechanism

    Publication number: US10048741B1

    Publication date: 2018-08-14

    Application number: US15416993

    Application date: 2017-01-26

    Abstract: Systems, apparatuses, and methods for implementing performance estimation mechanisms are disclosed. In one embodiment, a computing system includes at least one processor and a memory subsystem. During a characterization phase, the system utilizes a memory intensive workload to detect when the memory subsystem reaches its saturation point. Then, the system collects performance counter values during a sampling phase of a target application to determine the memory bandwidth. If the memory bandwidth is greater than the saturation point, then the system generates a prediction of the memory time which is based on a ratio of the memory bandwidth over the saturation point. Otherwise, if the memory bandwidth is less than the saturation point, the system assumes memory time is constant versus processor frequency. Then, the system uses the memory time and an estimate of the compute time to estimate a phase time for the target application at different processor frequencies.
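
The estimation step can be sketched as below, with heavy caveats. The abstract only says that the memory-time prediction is "based on a ratio" of bandwidth over the saturation point; scaling the sampled memory time by that ratio, and scaling compute time inversely with frequency, are assumptions made here for illustration. The function name and parameters are hypothetical.

```python
def estimate_phase_times(compute_time_s, memory_time_s, bandwidth, saturation,
                         sample_freq_ghz, candidate_freqs_ghz):
    """Estimate phase time at several processor frequencies.

    compute_time_s and memory_time_s are taken at sample_freq_ghz.
    bandwidth and saturation share the same unit (for example GB/s).
    """
    if bandwidth > saturation:
        # Assumption: memory time stretches by the bandwidth/saturation ratio.
        memory_time = memory_time_s * (bandwidth / saturation)
    else:
        # Below saturation, memory time is treated as constant vs. frequency.
        memory_time = memory_time_s

    # Assumption: compute time scales inversely with processor frequency.
    return {
        f: compute_time_s * sample_freq_ghz / f + memory_time
        for f in candidate_freqs_ghz
    }


estimates = estimate_phase_times(
    compute_time_s=2.0, memory_time_s=1.0,
    bandwidth=30.0, saturation=20.0,
    sample_freq_ghz=2.0, candidate_freqs_ghz=[1.0, 2.0, 4.0],
)
```

With these inputs the memory term is the same at every frequency, so raising the clock shrinks only the compute portion of the phase.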

    REAL-TIME PERFORMANCE TRACKING USING DYNAMIC COMPILATION

    Publication number: US20170371761A1

    Publication date: 2017-12-28

    Application number: US15192748

    Application date: 2016-06-24

    CPC classification number: G06F11/3604 G06F9/45516

    Abstract: Systems, apparatuses, and methods for performing real-time tracking of performance targets using dynamic compilation. A performance target is specified in a service level agreement. A dynamic compiler analyzes a software application executing in real-time and determines which high-level application metrics to track. The dynamic compiler then inserts instructions into the code to increment counters associated with the metrics. A power optimization unit then utilizes the counters to determine if the system is currently meeting the performance target. If the system is exceeding the performance target, then the power optimization unit reduces the power consumption of the system while still meeting the performance target.
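
The counter-driven control decision could look something like the sketch below. It models only the power optimization step, not the dynamic compiler. The abstract describes lowering power when the target is exceeded; raising power when the target is missed is an added assumption, as are the function name, the fixed step size, and the power limits.

```python
def adjust_power(ops_completed, interval_s, target_ops_per_s, power_w,
                 step_w=5.0, min_w=20.0, max_w=100.0):
    """Pick the next power setting from an instrumented counter reading.

    ops_completed: value of a counter incremented by inserted instructions
                   during the last interval (hypothetical metric).
    """
    rate = ops_completed / interval_s
    if rate > target_ops_per_s:
        # Exceeding the service-level target: trade the surplus for power.
        return max(min_w, power_w - step_w)
    if rate < target_ops_per_s:
        # Assumption (not in the abstract): restore power when falling short.
        return min(max_w, power_w + step_w)
    return power_w


lowered = adjust_power(1200, 1.0, target_ops_per_s=1000, power_w=60.0)
raised = adjust_power(800, 1.0, target_ops_per_s=1000, power_w=60.0)
```

Calling this once per sampling interval yields a simple step controller that hovers around the target rate.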

    MANAGING CLUSTER-LEVEL PERFORMANCE VARIABILITY WITHOUT A CENTRALIZED CONTROLLER

    Publication number: US20170366412A1

    Publication date: 2017-12-21

    Application number: US15183625

    Application date: 2016-06-15

    Inventor: Leonardo Piga

    CPC classification number: H04L67/10 G06F9/00 H04L67/1008 H04L67/1029 H04L67/28

    Abstract: Systems, apparatuses, and methods for managing cluster-level performance variability without a centralized controller are described. Each node of a multi-node cluster tracks a maximum and minimum progress across the plurality of nodes for a workload executed by the cluster. Each node also tracks its local progress on its current task. Each node also utilizes a comparison of the local progress to reported maximum and minimum progress across the cluster to identify a critical, or slow, node and whether to increase or reduce an amount of power allocated to the node. The nodes append information about the maximum and minimum progress to messages sent to other nodes to report their knowledge of maximum and minimum progress with other nodes. A node updates its local information if the node receives a message from another node with more up-to-date information about the state of progress across the cluster.
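
A toy version of this controller-free exchange is sketched below. It assumes that "more up-to-date" is judged by a logical timestamp carried with the piggybacked information, and that a node at the reported minimum should get more power while a node at the reported maximum should give some up. The class names, the timestamp scheme, and the three action labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgressInfo:
    max_progress: float
    min_progress: float
    stamp: int  # assumed logical timestamp used to judge freshness


class Node:
    def __init__(self, name, local_progress):
        self.name = name
        self.local = local_progress
        self.info = ProgressInfo(local_progress, local_progress, 0)

    def _fold_local(self, info):
        # Make sure the node's own progress is reflected in the bounds.
        return ProgressInfo(max(info.max_progress, self.local),
                            min(info.min_progress, self.local),
                            info.stamp)

    def outgoing_info(self, now):
        """Progress bounds to append to a message sent at time `now`."""
        refreshed = ProgressInfo(self.info.max_progress,
                                 self.info.min_progress, now)
        self.info = self._fold_local(refreshed)
        return self.info

    def receive(self, info):
        """Adopt piggybacked bounds only if they are newer than ours."""
        if info.stamp > self.info.stamp:
            self.info = self._fold_local(info)

    def power_action(self):
        if self.info.max_progress == self.info.min_progress:
            return "hold"
        if self.local <= self.info.min_progress:
            return "increase"  # slowest known node: likely critical
        if self.local >= self.info.max_progress:
            return "decrease"  # fastest known node: can give up power
        return "hold"


a, b, c = Node("a", 0.8), Node("b", 0.3), Node("c", 0.5)
b.receive(a.outgoing_info(now=1))  # b learns that some node is at 0.8
msg = b.outgoing_info(now=2)       # b reports max 0.8 and min 0.3
a.receive(msg)
c.receive(msg)
actions = {n.name: n.power_action() for n in (a, b, c)}
```

No node ever sees the whole cluster; each decision rests only on the bounds that happened to reach it through ordinary messages.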

    MANAGING VARIATIONS AMONG NODES IN PARALLEL SYSTEM FRAMEWORKS

    Publication number: US20170279703A1

    Publication date: 2017-09-28

    Application number: US15081558

    Application date: 2016-03-25

    CPC classification number: H04L43/16 H04L43/08 H04L67/10 H04L67/1008

    Abstract: Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.

    STORAGE LOCATION ASSIGNMENT AT A CLUSTER COMPUTE SERVER

    Type: Patent application (status: under examination, published)

    Publication number: US20160173589A1

    Publication date: 2016-06-16

    Application number: US14568181

    Application date: 2014-12-12

    Abstract: A cluster compute server stores different types of data at different storage volumes in order to reduce data duplication at the storage volumes. The storage volumes are categorized into two classes: common storage volumes and dedicated storage volumes, wherein the common storage volumes store data to be accessed and used by multiple compute nodes (or multiple virtual servers) of the cluster compute server. The dedicated storage volumes, in contrast, store data to be accessed only by a corresponding compute node (or virtual server).

    Thread assignment for power and performance efficiency using multiple power states

    Type: Granted patent (status: in force)

    Publication number: US09170854B2

    Publication date: 2015-10-27

    Application number: US13909789

    Application date: 2013-06-04

    Abstract: A method is performed in a computing system that includes a plurality of processing nodes of multiple types configurable to run in multiple performance states. In the method, an application executes on a thread assigned to a first processing node. Power and performance of the application on the first processing node are estimated. Power and performance of the application in multiple performance states on other processing nodes of the plurality of processing nodes besides the first processing node are also estimated. It is determined that the estimated power and performance of the application on a second processing node in a respective performance state of the multiple performance states is preferable to the power and performance of the application on the first processing node. The thread is reassigned to the second processing node, with the second processing node in the respective performance state.
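
The reassignment decision can be sketched as a comparison over estimated (node, performance state) pairs. The abstract does not say how "preferable" is judged, so performance per watt is used here purely as an assumed criterion; the function name, node labels, and numbers are hypothetical.

```python
def choose_placement(current, estimates):
    """Pick the (node, performance_state) pair to run the thread on.

    current: the (node, state) pair the thread runs on now
    estimates: {(node, state): (performance, power_w)} for every candidate,
               including the current placement
    """
    def efficiency(key):
        performance, power_w = estimates[key]
        return performance / power_w  # assumed preference: performance per watt

    best = max(estimates, key=efficiency)
    # Move only when the candidate is strictly better than staying put.
    return best if efficiency(best) > efficiency(current) else current


estimates = {
    ("big", "P0"): (100.0, 50.0),
    ("big", "P2"): (70.0, 25.0),
    ("little", "P0"): (60.0, 15.0),
    ("little", "P1"): (40.0, 12.0),
}
target = choose_placement(("big", "P0"), estimates)
```

A different criterion, such as meeting a performance floor at minimum power, would slot into `efficiency` without changing the rest of the sketch.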
