OPTIMIZING COST AND PERFORMANCE FOR SERVERLESS DATA ANALYTICS WORKLOADS

    Publication No.: US20240289180A1

    Publication date: 2024-08-29

    Application No.: US18175411

    Filing date: 2023-02-27

    CPC classification number: G06F9/5083 G06F9/5016 G06F9/5027

    Abstract: Systems and methods are provided for optimizing a serverless workflow. Given a directed acyclic graph (“DAG”) defining functional relationships and a gamma tuning factor to indicate a preference between cost and performance, a serverless workflow corresponding to the DAG may be optimized. The optimization is carried out in accordance with the gamma tuning factor, and is carried out in sub-segments of the DAG called stages. In addition, systems for allowing disparate types of storage media to be utilized by a serverless platform to store data are disclosed. The serverless platforms maintain visibility of the storage media types underlying persistent volumes, and may store data in partitions across disparate types of storage media. For instance, one item of data may be stored partially at a byte addressed storage media and partially at a block addressed storage media.
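One way to picture the stage-by-stage trade-off described in this abstract is the minimal sketch below. It is not the patented method. It assumes, purely for illustration, that gamma lies in [0, 1], that gamma = 1 means "minimize cost" and gamma = 0 means "minimize latency", and that each stage has a small list of candidate configurations with known cost and latency. The names `Candidate`, `pick_for_stage`, and `optimize_workflow` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical configuration option for one stage of the DAG."""
    name: str
    cost: float     # assumed unit: dollars per stage run
    latency: float  # assumed unit: seconds per stage run


def pick_for_stage(candidates, gamma):
    """Pick the candidate with the lowest gamma-weighted score.

    Assumption: score = gamma * normalized cost + (1 - gamma) * normalized latency.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is assumed to lie in [0, 1]")
    # Normalize by the largest value in the stage so both terms are comparable.
    max_cost = max(c.cost for c in candidates) or 1.0
    max_latency = max(c.latency for c in candidates) or 1.0

    def score(c):
        return gamma * c.cost / max_cost + (1.0 - gamma) * c.latency / max_latency

    return min(candidates, key=score)


def optimize_workflow(stages, gamma):
    """Optimize each stage (sub-segment of the DAG) independently."""
    return {stage: pick_for_stage(cands, gamma).name for stage, cands in stages.items()}


stages = {
    "extract": [Candidate("small", 1.0, 10.0), Candidate("large", 4.0, 2.0)],
    "aggregate": [Candidate("small", 2.0, 6.0), Candidate("large", 3.0, 5.0)],
}
cost_first = optimize_workflow(stages, gamma=1.0)
speed_first = optimize_workflow(stages, gamma=0.0)
```

Treating each stage independently keeps the search space small, at the price of ignoring interactions between stages; the abstract does not say how the actual system handles such interactions.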

    SYSTEMS AND METHODS OF RESOURCE CONFIGURATION OPTIMIZATION FOR MACHINE LEARNING WORKLOADS

    Publication No.: US20210357256A1

    Publication date: 2021-11-18

    Application No.: US16874479

    Filing date: 2020-05-14

    Abstract: Systems and methods are provided for optimally allocating resources used to perform multiple tasks/jobs, e.g., machine learning training jobs. The possible resource configurations or candidates that can be used to perform such jobs are generated. A first batch of training jobs can be randomly selected and run using one of the possible resource configuration candidates. Subsequent batches of training jobs may be performed using other resource configuration candidates that have been selected using an optimization process, e.g., Bayesian optimization. Upon reaching a stopping criterion, the resource configuration resulting in a desired optimization metric, e.g., fastest job completion time, can be selected and used to execute the remaining training jobs.
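The loop this abstract describes (generate candidates, try one at random, pick later candidates with an optimizer, stop on a criterion, keep the best) can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the optimizer here is a simple greedy neighbor search used as a stand-in for Bayesian optimization, the stopping criterion is a fixed evaluation budget, and `generate_candidates`, `search`, and `fake_batch_time` are hypothetical names.

```python
import itertools
import random


def generate_candidates(cpus, mems):
    """Enumerate every (cpu, memory) resource configuration candidate."""
    return list(itertools.product(cpus, mems))


def search(candidates, run_batch, budget, seed=0):
    """Evaluate up to `budget` candidates and return the fastest one seen.

    run_batch(config) is assumed to run one batch of training jobs and
    return its completion time (lower is better).
    """
    rng = random.Random(seed)
    remaining = list(candidates)
    current = rng.choice(remaining)  # first candidate is chosen at random
    observed = {}
    while remaining and len(observed) < budget:  # stopping criterion: budget
        remaining.remove(current)
        observed[current] = run_batch(current)
        if not remaining:
            break
        best = min(observed, key=observed.get)
        # Stand-in for an acquisition function: try the untested candidate
        # closest (Manhattan distance) to the best configuration so far.
        current = min(
            remaining,
            key=lambda c: sum(abs(a - b) for a, b in zip(c, best)),
        )
    return min(observed, key=observed.get), observed


def fake_batch_time(config):
    """Hypothetical timing model: more CPU and memory means a faster batch."""
    cpu, mem = config
    return 100.0 / cpu + 10.0 / mem


candidates = generate_candidates(cpus=[1, 2, 4], mems=[2, 4])
best_config, history = search(candidates, fake_batch_time, budget=len(candidates))
```

A real Bayesian optimizer would replace the nearest-neighbor step with a surrogate model and an acquisition function, which is what lets it stop well before every candidate has been tried.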

    SYSTEMS AND METHODS OF RESOURCE CONFIGURATION OPTIMIZATION FOR MACHINE LEARNING WORKLOADS

    Publication No.: US20220292303A1

    Publication date: 2022-09-15

    Application No.: US17199294

    Filing date: 2021-03-11

    Abstract: Systems and methods can be configured to determine a plurality of computing resource configurations used to perform machine learning model training jobs. A computing resource configuration can comprise: a first tuple including numbers of worker nodes and parameter server nodes, and a second tuple including resource allocations for the worker nodes and parameter server nodes. At least one machine learning training job can be executed using a first computing resource configuration having a first set of values associated with the first tuple. While the machine learning training job is executing: resource usage of the worker nodes and parameter server nodes caused by a second set of values associated with the second tuple can be monitored, and whether to adjust the second set of values can be determined. Whether a stopping criterion is satisfied can be determined. One of the plurality of computing resource configurations can be selected.
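The two-tuple configuration in this abstract, with the second tuple adjusted from monitored usage while a job runs, might be modeled as in the sketch below. Everything beyond the two-tuple shape is an assumption made for illustration: the allocations are taken to be CPU cores per node, the utilization thresholds and step size are arbitrary, and `ResourceConfig` and `adjust_allocations` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ResourceConfig:
    # First tuple: (number of worker nodes, number of parameter server nodes).
    node_counts: tuple
    # Second tuple: (CPU cores per worker, CPU cores per parameter server).
    # CPU cores are an assumed unit; the abstract only says "resource allocations".
    allocations: tuple


def adjust_allocations(config, usage, low=0.3, high=0.9, step=1.0):
    """Return a config whose second tuple reflects the monitored usage.

    usage holds the observed consumption per node type, in the same units
    as config.allocations. The thresholds and step size are illustrative.
    """
    adjusted = []
    for allocated, used in zip(config.allocations, usage):
        utilization = used / allocated
        if utilization > high:
            allocated += step  # node type is starved: grow its allocation
        elif utilization < low and allocated > step:
            allocated -= step  # node type is idle: shrink its allocation
        adjusted.append(allocated)
    # The first tuple (node counts) is left unchanged during the job.
    return replace(config, allocations=tuple(adjusted))


config = ResourceConfig(node_counts=(4, 2), allocations=(2.0, 4.0))
# Workers use 1.9 of 2.0 cores (95%); parameter servers use 0.8 of 4.0 (20%).
tuned = adjust_allocations(config, usage=(1.9, 0.8))
```

Keeping the node counts fixed while tuning only the per-node allocations mirrors the abstract's split between the first and second tuples; how the actual system decides on an adjustment is not stated there.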
