SCHEDULING JOBS ON GRAPHICAL PROCESSING UNITS

    Publication Number: US20240095870A1

    Publication Date: 2024-03-21

    Application Number: US18307728

    Application Date: 2023-04-26

    CPC classification number: G06T1/20

    Abstract: Example implementations relate to scheduling jobs for a plurality of graphics processing units (GPUs) that provide concurrent processing through a plurality of virtual GPUs (vGPUs). According to an example, a computing system including one or more GPUs receives a request to schedule a new job for execution. The new job is allocated to one or more vGPUs, and the allocations of existing jobs across the vGPUs are updated, such that the operational cost of running the one or more GPUs and the migration cost of allocating the new job are minimized. The new job and the existing jobs are then processed by the one or more GPUs in the computing system.
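
    The abstract describes a joint optimization: place the new job and, where beneficial, re-place existing jobs on vGPUs so that the operational cost of the powered-on GPUs plus the migration cost of moving existing jobs is minimized. Below is a minimal Python sketch of that idea, assuming simple additive cost models and a brute-force search over placements; the cost constants, the capacity model, and all names are illustrative assumptions, not the patent's method.

        from itertools import product

        # Hypothetical cost model: a fixed cost per powered-on physical GPU
        # plus a fixed cost per migrated existing job. Both constants are
        # assumptions for illustration.
        OP_COST_PER_GPU = 1.0
        MIGRATION_COST = 0.3

        def total_cost(placement, current, vgpu_to_gpu):
            """placement/current map job -> vGPU; vgpu_to_gpu maps vGPU -> GPU."""
            gpus_in_use = {vgpu_to_gpu[v] for v in placement.values()}
            operational = OP_COST_PER_GPU * len(gpus_in_use)
            migrations = sum(1 for j, v in placement.items()
                             if j in current and current[j] != v)
            return operational + MIGRATION_COST * migrations

        def schedule(new_job, current, vgpus, vgpu_to_gpu, capacity, demand):
            """Exhaustively search joint placements of all jobs and return the
            cheapest one that respects per-vGPU capacity (small instances only)."""
            jobs = list(current) + [new_job]
            best, best_cost = None, float("inf")
            for assignment in product(vgpus, repeat=len(jobs)):
                placement = dict(zip(jobs, assignment))
                load = {}
                for j, v in placement.items():
                    load[v] = load.get(v, 0) + demand[j]
                if any(load[v] > capacity[v] for v in load):
                    continue  # skip placements that overload a vGPU
                cost = total_cost(placement, current, vgpu_to_gpu)
                if cost < best_cost:
                    best, best_cost = placement, cost
            return best, best_cost

        # Example: moving job2 to vgpu0 frees vgpu1 for job3, keeping all work
        # on gpu0; one migration (cost 0.3) beats powering on gpu1 (cost 1.0).
        best, cost = schedule(
            new_job="job3",
            current={"job1": "vgpu0", "job2": "vgpu1"},
            vgpus=["vgpu0", "vgpu1", "vgpu2"],
            vgpu_to_gpu={"vgpu0": "gpu0", "vgpu1": "gpu0", "vgpu2": "gpu1"},
            capacity={"vgpu0": 2, "vgpu1": 2, "vgpu2": 2},
            demand={"job1": 1, "job2": 1, "job3": 2},
        )

    A production scheduler would swap the exhaustive search, which grows as |vGPUs|^|jobs|, for a heuristic or an integer-programming formulation.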

    ASSIGNING JOBS TO HETEROGENEOUS GRAPHICS PROCESSING UNITS

    Publication Number: US20230089925A1

    Publication Date: 2023-03-23

    Application Number: US17448299

    Application Date: 2021-09-21

    Abstract: Architectures and techniques for managing heterogeneous sets of physical GPUs. A GPU device manager coupled with a heterogeneous set of physical GPUs collects functionality information for one or more of the physical GPUs. Based on the collected functionality information, the device manager manages at least one of the physical GPUs as multiple virtual GPUs and classifies each physical GPU as either a single physical GPU or as one or more virtual GPUs. Traffic representing processing jobs to be processed by at least a subset of the physical GPUs is received via a gateway programmed by a traffic manager. The processing jobs are scheduled and distributed to the appropriate GPU application by a GPU scheduler communicatively coupled with the traffic manager and with the GPU device manager.
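
    As a rough illustration of the classification step described above, the sketch below models a device manager that partitions virtualization-capable GPUs into fixed-size vGPU slices and exposes legacy GPUs whole. The dataclasses, field names, and the fixed slicing factor are assumptions made for illustration, not the claimed architecture.

        from dataclasses import dataclass
        from typing import List, Union

        @dataclass
        class PhysicalGPU:
            name: str
            memory_gb: int
            supports_virtualization: bool  # assumed capability flag

        @dataclass
        class VirtualGPU:
            backing_gpu: str
            memory_gb: int

        @dataclass
        class GPUDeviceManager:
            physical_gpus: List[PhysicalGPU]
            slices_per_gpu: int = 4  # assumed fixed partitioning factor

            def classify(self) -> List[Union[PhysicalGPU, VirtualGPU]]:
                """Return the schedulable device list: vGPU slices for GPUs
                that report virtualization support, whole devices otherwise."""
                devices = []
                for gpu in self.physical_gpus:
                    if gpu.supports_virtualization:
                        slice_mem = gpu.memory_gb // self.slices_per_gpu
                        devices.extend(VirtualGPU(gpu.name, slice_mem)
                                       for _ in range(self.slices_per_gpu))
                    else:
                        devices.append(gpu)
                return devices

        # A heterogeneous pool: one virtualizable GPU and one legacy GPU.
        manager = GPUDeviceManager([
            PhysicalGPU("gpu-a", memory_gb=40, supports_virtualization=True),
            PhysicalGPU("gpu-b", memory_gb=16, supports_virtualization=False),
        ])
        print(manager.classify())  # four 10 GB vGPU slices, plus gpu-b as-is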

    PROACTIVELY ACCOMMODATING PREDICTED FUTURE SERVERLESS WORKLOADS USING A MACHINE LEARNING PREDICTION MODEL

    Publication Number: US20210184942A1

    Publication Date: 2021-06-17

    Application Number: US16931850

    Application Date: 2020-07-17

    Abstract: Example implementations relate to a proactive auto-scaling approach. According to an example, a machine-learning prediction model is trained, based on past serverless workload information associated with an application running in a public cloud, to forecast future serverless workloads for the application during a window of time. During the window of time, serverless workload information associated with the application is monitored. A future serverless workload for the application at a future time within the window is predicted based on the machine-learning prediction model. Prior to the future time, containers within the public cloud executing the application are pre-warmed to accommodate the predicted future serverless workload by issuing fake requests to the application that trigger the auto-scaling functionality implemented by the public cloud.
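
    Below is a minimal sketch of the monitor-predict-pre-warm loop the abstract describes, assuming a least-squares trend forecaster and a hypothetical warm-up endpoint. APP_URL, the spike threshold, and the fan-out heuristic are invented for illustration; the abstract does not specify the prediction model.

        import time
        import urllib.request

        import numpy as np

        APP_URL = "http://example.invalid/app/healthz"  # hypothetical endpoint
        SPIKE_THRESHOLD = 100.0  # assumed req/s level that warrants pre-warming

        def forecast_next(window):
            """Fit a line to the observed window and extrapolate one step ahead."""
            t = np.arange(len(window))
            slope, intercept = np.polyfit(t, window, deg=1)
            return slope * len(window) + intercept

        def prewarm(n_requests):
            """Issue fake requests so the cloud's auto-scaler adds containers
            before the real traffic arrives."""
            for _ in range(n_requests):
                try:
                    urllib.request.urlopen(APP_URL, timeout=1)
                except OSError:
                    pass  # warm-up traffic is best-effort

        def control_loop(observe, interval_s=10, window_size=30):
            """observe() returns the current workload in requests per second."""
            window = []
            while True:
                window.append(observe())
                window = window[-window_size:]
                if len(window) == window_size:
                    predicted = forecast_next(window)
                    if predicted > SPIKE_THRESHOLD:
                        prewarm(int(predicted // 10))  # assumed fan-out rule
                time.sleep(interval_s)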

    SCHEDULING JOBS ON GRAPHICAL PROCESSING UNITS

    Publication Number: US20220414817A1

    Publication Date: 2022-12-29

    Application Number: US17360122

    Application Date: 2021-06-28

    Abstract: Example implementations relate to scheduling jobs for a plurality of graphics processing units (GPUs) that provide concurrent processing through a plurality of virtual GPUs (vGPUs). According to an example, a computing system including one or more GPUs receives a request to schedule a new job for execution. The new job is allocated to one or more vGPUs, and the allocations of existing jobs across the vGPUs are updated, such that the operational cost of running the one or more GPUs and the migration cost of allocating the new job are minimized. The new job and the existing jobs are then processed by the one or more GPUs in the computing system.

    Scheduling jobs on graphical processing units

    Publication Number: US11651470B2

    Publication Date: 2023-05-16

    Application Number: US17360122

    Application Date: 2021-06-28

    CPC classification number: G06T1/20

    Abstract: Example implementations relate to scheduling jobs for a plurality of graphics processing units (GPUs) that provide concurrent processing through a plurality of virtual GPUs (vGPUs). According to an example, a computing system including one or more GPUs receives a request to schedule a new job for execution. The new job is allocated to one or more vGPUs, and the allocations of existing jobs across the vGPUs are updated, such that the operational cost of running the one or more GPUs and the migration cost of allocating the new job are minimized. The new job and the existing jobs are then processed by the one or more GPUs in the computing system.

    PROACTIVELY ACCOMMODATING PREDICTED FUTURE SERVERLESS WORKLOADS USING A MACHINE LEARNING PREDICTION MODEL AND A FEEDBACK CONTROL SYSTEM

    Publication Number: US20210184941A1

    Publication Date: 2021-06-17

    Application Number: US16714637

    Application Date: 2019-12-13

    Abstract: Example implementations relate to a proactive auto-scaling approach. According to an example, a target performance metric is received for an application running in a serverless framework of a private cloud. A machine-learning prediction model is trained, based on historical serverless workload information, to forecast future serverless workloads for the application during a window of time. The serverless framework is monitored to obtain serverless workload observations for the application. A future serverless workload for the application at a future time is predicted by the trained machine-learning prediction model based on those observations. A feedback control system then outputs a new number of replicas based on the current value of the performance metric, the target performance metric, and the predicted future serverless workload. Finally, the serverless framework is caused to scale and pre-warm the number of replicas supporting the application to the new number.
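
    As a rough sketch of the final step, the controller below combines a feedforward term sized from the predicted workload with a proportional feedback term driven by the gap between the measured and target performance metric. The gain, the per-replica capacity, and the bounds are illustrative assumptions, not the patent's controller.

        import math

        CAPACITY_PER_REPLICA = 50.0  # assumed req/s one replica can serve
        KP = 0.5                     # assumed proportional gain
        MIN_REPLICAS, MAX_REPLICAS = 1, 100

        def new_replica_count(current_replicas, current_latency_ms,
                              target_latency_ms, predicted_workload_rps):
            # Feedforward: replicas sized to the predicted future workload.
            feedforward = math.ceil(predicted_workload_rps / CAPACITY_PER_REPLICA)
            # Feedback: scale up when measured latency exceeds the target.
            error = (current_latency_ms - target_latency_ms) / target_latency_ms
            feedback = KP * error * current_replicas
            replicas = round(max(feedforward, current_replicas + feedback))
            return max(MIN_REPLICAS, min(MAX_REPLICAS, replicas))

        # Example: 8 replicas at 260 ms against a 200 ms target, with 600 req/s
        # predicted, yields max(12, 8 + 1.2) = 12 replicas to pre-warm.
        print(new_replica_count(8, 260.0, 200.0, 600.0))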
