Systems and methods for reducing stranded power capacity

    Publication No.: US11307627B2

    Publication Date: 2022-04-19

    Application No.: US16863217

    Filing Date: 2020-04-30

    Abstract: Systems and methods described herein make previously stranded power capacity (power that is provisioned for a data center according to a computing system's nameplate power consumption but is currently not usable) available to the data center. Systems described herein generate empirical power profiles that specify expected upper bounds for the power consumption levels that applications trigger. Using these upper bounds, a computing system described herein can reliably release part of its provisioned nameplate power to other systems or data center consumers, reducing the amount of stranded power in the data center. The method described herein avoids performance penalties for most jobs by taking sensor measurements at a rapid rate, so that a system power cap based on a running application's measured peak power consumption is reliable with reference to the power capacitance inherent in the computing system.
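
The idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `releasable_power`, the safety `margin`, and the sample values are all assumptions made for illustration. The sketch derives a power cap from an application's measured peak power and reports how much nameplate power could be released.

```python
def releasable_power(nameplate_w, samples_w, margin=0.05):
    """Return (cap_w, released_w) for one computing system.

    nameplate_w: provisioned nameplate power in watts.
    samples_w:   measured power samples for the running application.
    margin:      hypothetical safety margin above the measured peak.
    """
    if not samples_w:
        raise ValueError("need at least one power sample")
    peak_w = max(samples_w)
    # The cap is the measured peak plus the margin, never above nameplate.
    cap_w = min(nameplate_w, peak_w * (1.0 + margin))
    # Whatever lies between the cap and nameplate is no longer stranded.
    return cap_w, nameplate_w - cap_w


# Illustrative values only: a 1000 W nameplate and three samples.
cap_w, released_w = releasable_power(1000.0, [420.0, 610.0, 580.0], margin=0.10)
```

The sketch uses a plain maximum over the samples. How often the sensors must be read for that maximum to be trustworthy is the part the abstract ties to the system's inherent power capacitance, and it is not modeled here.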

    Job agent-based power-capping of systems

    Publication No.: US12001256B2

    Publication Date: 2024-06-04

    Application No.: US17195864

    Filing Date: 2021-03-09

    Abstract: A technique includes an agent executing on a plurality of nodes while a job is being concurrently executed by the plurality of nodes. The plurality of nodes is power-capped by an existing node power consumption budget. The technique includes managing power consumption of the plurality of nodes. The managing includes the agent determining a performance footprint that is associated with execution of the job; and the managing includes the agent determining a second node power consumption budget based on the performance footprint. The second node power consumption budget is different than the existing node power consumption budget. The managing includes the agent providing a power consumption request to a global power dispatcher to set a new node power consumption budget for the plurality of nodes.
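
The request flow in this abstract can be sketched as follows. Everything here is a hypothetical stand-in: the `footprint` score in [0, 1], the linear mapping in `propose_node_budget`, and the `GlobalPowerDispatcher` class that caps grants by an even share of a system limit are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field


def propose_node_budget(footprint, min_w, max_w):
    """Map a hypothetical performance footprint in [0, 1] to a node budget.

    Assumption: 1.0 means the job benefits fully from extra power.
    """
    if not 0.0 <= footprint <= 1.0:
        raise ValueError("footprint must be within [0, 1]")
    return min_w + footprint * (max_w - min_w)


@dataclass
class GlobalPowerDispatcher:
    """Hypothetical dispatcher that holds the per-node budgets."""
    system_limit_w: float
    node_budget_w: dict = field(default_factory=dict)

    def request(self, nodes, budget_w):
        # Grant the request, limited to an even share of the system limit.
        granted_w = min(budget_w, self.system_limit_w / len(nodes))
        for node in nodes:
            self.node_budget_w[node] = granted_w
        return granted_w


nodes = ["n0", "n1", "n2", "n3"]
dispatcher = GlobalPowerDispatcher(system_limit_w=1000.0)
existing_w = dispatcher.request(nodes, 200.0)       # existing node budget
second_w = propose_node_budget(0.5, 200.0, 400.0)   # agent's second budget
if second_w != existing_w:
    # The agent only contacts the dispatcher when the budgets differ.
    new_w = dispatcher.request(nodes, second_w)
```

In this sketch the dispatcher may grant less than the agent asked for, so the new node budget need not equal the second budget the agent computed.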

    OPTIMIZING OPERATION OF HIGH-PERFORMANCE COMPUTING SYSTEMS

    Publication No.: US20240095081A1

    Publication Date: 2024-03-21

    Application No.: US17948159

    Filing Date: 2022-09-19

    Abstract: A method for optimizing operations of high-performance computing (HPC) systems includes collecting data associated with a plurality of workload performance profiling counters associated with a workload during runtime of the workload in an HPC system. Based on the collected data, the method includes using a machine-learning technique to classify the workload by determining a workload-specific fingerprint for the workload. The method includes identifying an optimization metric to optimize during running of the workload in the HPC system. The method includes determining an optimal setting for a plurality of tunable hardware execution parameters as measured against the optimization metric by varying at least a portion of the plurality of tunable hardware execution parameters. The method includes storing the workload-specific fingerprint, the optimization metric, and the optimal setting for the plurality of tunable hardware execution parameters as measured against the optimization metric in an architecture-specific knowledge database.
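
The parameter sweep and knowledge database described in this abstract can be sketched roughly as below. This is an illustration under stated assumptions: the fingerprint is taken as an already computed label (the machine-learning classification step is not modeled), the tunable names `freq_ghz` and `threads` are invented, and `toy_metric` is a made-up stand-in for a real measurement where lower is better.

```python
import itertools


def best_setting(tunables, measure):
    """Exhaustively vary the tunables and keep the lowest-scoring setting."""
    names = sorted(tunables)
    best, best_score = None, None
    for values in itertools.product(*(tunables[n] for n in names)):
        setting = dict(zip(names, values))
        score = measure(setting)
        if best_score is None or score < best_score:
            best, best_score = setting, score
    return best, best_score


def toy_metric(setting):
    # Hypothetical metric with its minimum at 2.0 GHz and 32 threads.
    return abs(setting["freq_ghz"] - 2.0) + abs(setting["threads"] - 32) / 32


tunables = {"freq_ghz": [1.5, 2.0, 2.5], "threads": [16, 32, 64]}
setting, score = best_setting(tunables, toy_metric)

# Knowledge database keyed by (architecture, fingerprint, metric name);
# all three key values here are placeholders.
knowledge_db = {("arch-x", "fingerprint-a", "toy_metric"): setting}
```

A later run that produces the same fingerprint on the same architecture could look up the stored setting instead of repeating the sweep, which appears to be the purpose of keeping the database architecture-specific.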

    JOB AGENT-BASED POWER-CAPPING OF SYSTEMS

    Publication No.: US20220291734A1

    Publication Date: 2022-09-15

    Application No.: US17195864

    Filing Date: 2021-03-09

    Abstract: A technique includes an agent executing on a plurality of nodes while a job is being concurrently executed by the plurality of nodes. The plurality of nodes is power-capped by an existing node power consumption budget. The technique includes managing power consumption of the plurality of nodes. The managing includes the agent determining a performance footprint that is associated with execution of the job; and the managing includes the agent determining a second node power consumption budget based on the performance footprint. The second node power consumption budget is different than the existing node power consumption budget. The managing includes the agent providing a power consumption request to a global power dispatcher to set a new node power consumption budget for the plurality of nodes.
