-
Publication No.: US20240095870A1
Publication Date: 2024-03-21
Application No.: US18307728
Application Date: 2023-04-26
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Diman Zad Tootaghaj, Junguk Cho, Puneet Sharma
IPC: G06T1/20
CPC classification number: G06T1/20
Abstract: Example implementations relate to scheduling jobs for a plurality of graphics processing units (GPUs) that provide concurrent processing through a plurality of virtual GPUs (vGPUs). According to an example, a computing system including one or more GPUs receives a request to schedule a new job for execution. The new job is allocated to one or more vGPUs, and the allocations of existing jobs to the vGPUs are updated, minimizing both the operational cost of running the one or more GPUs and the migration cost of placing the new job. The new job and the existing jobs are then processed by the one or more GPUs in the computing system.
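Hedged illustration: the allocation step described above can be sketched as a greedy placement that scores each feasible vGPU under a combined cost model. All names (Job, VGpu, schedule) and the fixed per-GPU operational cost and per-migration penalty are assumptions for illustration, not the patent's actual formulation; this sketch only places the new job without migrating existing ones, so the migration term stays at zero.

```python
# A minimal sketch, assuming a fixed per-GPU operational cost and a
# per-migration penalty; not the patent's actual cost model.
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    demand: int                  # vGPU capacity units required

@dataclass
class VGpu:
    gpu_id: int                  # physical GPU this vGPU is carved from
    capacity: int
    jobs: list = field(default_factory=list)

    def free(self):
        return self.capacity - sum(j.demand for j in self.jobs)

OP_COST_PER_GPU = 10.0           # assumed cost of a powered-on physical GPU
MIGRATION_COST = 3.0             # assumed cost of moving one existing job

def total_cost(vgpus, migrations):
    active_gpus = {v.gpu_id for v in vgpus if v.jobs}
    return OP_COST_PER_GPU * len(active_gpus) + MIGRATION_COST * migrations

def schedule(new_job, vgpus):
    """Greedily place new_job on the feasible vGPU with the lowest cost."""
    best, best_cost = None, float("inf")
    for v in vgpus:
        if v.free() >= new_job.demand:
            v.jobs.append(new_job)                  # tentative placement
            cost = total_cost(vgpus, migrations=0)  # sketch never migrates
            if cost < best_cost:
                best, best_cost = v, cost
            v.jobs.remove(new_job)                  # undo tentative placement
    if best is None:
        raise RuntimeError("no vGPU can fit the job without migration")
    best.jobs.append(new_job)
    return best

# Two vGPUs carved from GPU 0, one from GPU 1; consolidation onto the
# already-active GPU 0 minimizes operational cost.
vgpus = [VGpu(0, 4), VGpu(0, 4), VGpu(1, 8)]
vgpus[0].jobs.append(Job("existing", 2))
print(f"placed on GPU {schedule(Job('new', 3), vgpus).gpu_id}")  # GPU 0
```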
-
Publication No.: US12067420B2
Publication Date: 2024-08-20
Application No.: US17077962
Application Date: 2020-10-22
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Junguk Cho, Puneet Sharma, Dominik Stiller
CPC classification number: G06F9/5011, B60W50/00, G05D1/0212, G06N20/00, G08G1/20, G08G1/202
Abstract: Systems and methods are provided for improving autotuning procedures. For example, the system can implement a task launcher, a scheduler, and an agent to launch, schedule, and execute decomposed autotuning stages, respectively. The scheduling policy implemented by the scheduler may go beyond a simple policy (e.g., a FIFO-based scheduling policy) that produces high queuing delay. By leveraging autotuning-specific domain knowledge, the scheduler may reduce the queuing delay and poor resource utilization otherwise found in traditional systems.
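Hedged illustration: one shape such a domain-aware policy could take is a shortest-estimate-first queue over decomposed autotuning stages, so that quick measurement stages are not stuck behind long compilation stages. The stage names and the heuristic are assumptions for illustration, not the patent's actual policy.

```python
# A minimal sketch: order decomposed autotuning stages by an
# autotuning-specific runtime estimate instead of FIFO arrival order.
import heapq

class AutotuneScheduler:
    def __init__(self):
        self._queue = []   # (estimated_runtime, seq, stage) min-heap
        self._seq = 0      # tie-breaker so heapq never compares stage names

    def submit(self, stage_name, estimated_runtime):
        heapq.heappush(self._queue, (estimated_runtime, self._seq, stage_name))
        self._seq += 1

    def next_stage(self):
        """Dispatch the stage with the shortest runtime estimate first."""
        return heapq.heappop(self._queue)[2]

sched = AutotuneScheduler()
sched.submit("compile-candidate", estimated_runtime=30.0)
sched.submit("measure-kernel", estimated_runtime=2.0)
sched.submit("search-step", estimated_runtime=5.0)
print(sched.next_stage())   # "measure-kernel" runs first, not FIFO order
```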
-
Publication No.: US20230089925A1
Publication Date: 2023-03-23
Application No.: US17448299
Application Date: 2021-09-21
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Junguk Cho, Puneet Sharma, Diman Zad Tootaghaj
Abstract: Architectures and techniques for managing heterogeneous sets of physical GPUs. Functionality information is collected for one or more physical GPUs by a GPU device manager coupled with a heterogeneous set of physical GPUs. Based on the collected functionality information, the GPU device manager manages at least one of the physical GPUs as multiple virtual GPUs, classifying each physical GPU as either a single physical GPU or as one or more virtual GPUs. Traffic representing processing jobs is received by at least a subset of the physical GPUs via a gateway programmed by a traffic manager. Received processing jobs are scheduled by a GPU scheduler, communicatively coupled with the traffic manager and with the GPU device manager, and are distributed to the scheduled GPU application for processing.
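Hedged illustration: the classification step might look like the sketch below, where a device manager splits partition-capable GPUs into fixed-size vGPU slices and exposes the rest whole. The split rule, the slice size, and the device names are assumptions for illustration.

```python
# A minimal sketch of classifying heterogeneous physical GPUs as either
# a single device or several vGPU slices, from collected functionality info.
from dataclasses import dataclass

@dataclass
class PhysicalGpu:
    name: str
    memory_gb: int
    supports_partitioning: bool   # e.g., hardware vGPU/MIG-style support

def classify(gpus, vgpu_memory_gb=8):
    """Map each physical GPU to one whole device or several vGPU slices."""
    inventory = {}
    for gpu in gpus:
        if gpu.supports_partitioning and gpu.memory_gb > vgpu_memory_gb:
            slices = gpu.memory_gb // vgpu_memory_gb
            inventory[gpu.name] = [f"{gpu.name}-vgpu{i}" for i in range(slices)]
        else:
            inventory[gpu.name] = [gpu.name]   # managed as a single GPU
    return inventory

fleet = [PhysicalGpu("a100", 40, True), PhysicalGpu("t4", 16, False)]
print(classify(fleet))
# {'a100': ['a100-vgpu0', ..., 'a100-vgpu4'], 't4': ['t4']}
```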
-
Publication No.: US20210184942A1
Publication Date: 2021-06-17
Application No.: US16931850
Application Date: 2020-07-17
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Diman Zad Tootaghaj, Junguk Cho, Puneet Sharma
Abstract: Example implementations relate to a proactive auto-scaling approach. According to an example, a machine-learning prediction model is trained, based on past serverless workload information associated with an application running in a public cloud, to forecast future serverless workloads for the application during a window of time. During the window of time, serverless workload information associated with the application is monitored. A future serverless workload for the application at a future time within the window is predicted based on the machine-learning prediction model. Prior to the future time, containers within the public cloud executing the application are pre-warmed to accommodate the predicted future serverless workload by issuing fake requests to the application to trigger the auto-scaling functionality implemented by the public cloud.
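Hedged illustration: the pre-warming step could look like the sketch below, which sizes a burst of synthetic requests from the forecast so the cloud's own auto-scaler spins containers up early. The per-container capacity, the predict_workload() stand-in, and the endpoint URL are assumptions; a real deployment would issue actual HTTP requests.

```python
# A minimal sketch of pre-warming via fake requests ahead of a
# predicted workload spike.
import time

REQUESTS_PER_CONTAINER = 50        # assumed capacity of one warm container

def predict_workload(t):
    """Stand-in for the trained prediction model's forecast at time t."""
    return 400                     # e.g., 400 requests expected around t

def send_fake_request(app_url):
    # A real deployment would issue an HTTP request here (e.g., with
    # urllib.request); a print keeps the sketch self-contained.
    print(f"fake request -> {app_url}")

def prewarm(app_url, future_time):
    """Trigger the cloud's auto-scaler before the real traffic arrives."""
    predicted = predict_workload(future_time)
    needed = -(-predicted // REQUESTS_PER_CONTAINER)   # ceiling division
    for _ in range(needed):
        send_fake_request(app_url)

prewarm("http://my-app.example.com", future_time=time.time() + 120)
```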
-
Publication No.: US20220414817A1
Publication Date: 2022-12-29
Application No.: US17360122
Application Date: 2021-06-28
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Diman Zad Tootaghaj, Junguk Cho, Puneet Sharma
IPC: G06T1/20
Abstract: Example implementations relate to scheduling jobs for a plurality of graphics processing units (GPUs) that provide concurrent processing through a plurality of virtual GPUs (vGPUs). According to an example, a computing system including one or more GPUs receives a request to schedule a new job for execution. The new job is allocated to one or more vGPUs, and the allocations of existing jobs to the vGPUs are updated, minimizing both the operational cost of running the one or more GPUs and the migration cost of placing the new job. The new job and the existing jobs are then processed by the one or more GPUs in the computing system.
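Hedged illustration: the joint minimization in this abstract can be written as an integer program of roughly the following shape, where x_{j,v} places job j on vGPU v, y_g powers on physical GPU g, c_g is GPU g's operational cost, m_j and p(j) are the migration penalty and previous vGPU of an existing job j in the set E, d_j is a job's demand, C_v a vGPU's capacity, and g(v) the GPU hosting vGPU v. This formulation is an assumption for illustration; the abstract does not give the patent's actual objective.

```latex
\begin{aligned}
\min_{x,\,y}\quad
  & \sum_{g} c_g\, y_g
    \;+\; \sum_{j \in E} m_j \bigl(1 - x_{j,\,p(j)}\bigr)
    \quad \text{(operational + migration cost)} \\
\text{s.t.}\quad
  & \sum_{v} x_{j,v} = 1
    \quad \forall\, j \in E \cup \{j_{\text{new}}\}
    \quad \text{(every job placed)} \\
  & \sum_{j} d_j\, x_{j,v} \le C_v\, y_{g(v)}
    \quad \forall\, v
    \quad \text{(capacity only on powered-on GPUs)} \\
  & x_{j,v} \in \{0,1\}, \qquad y_g \in \{0,1\}.
\end{aligned}
```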
-
Publication No.: US11651470B2
Publication Date: 2023-05-16
Application No.: US17360122
Application Date: 2021-06-28
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Diman Zad Tootaghaj, Junguk Cho, Puneet Sharma
IPC: G06T1/20
CPC classification number: G06T1/20
Abstract: Example implementations relate to scheduling jobs for a plurality of graphics processing units (GPUs) that provide concurrent processing through a plurality of virtual GPUs (vGPUs). According to an example, a computing system including one or more GPUs receives a request to schedule a new job for execution. The new job is allocated to one or more vGPUs, and the allocations of existing jobs to the vGPUs are updated, minimizing both the operational cost of running the one or more GPUs and the migration cost of placing the new job. The new job and the existing jobs are then processed by the one or more GPUs in the computing system.
-
Publication No.: US11303534B2
Publication Date: 2022-04-12
Application No.: US16714637
Application Date: 2019-12-13
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Diman Zad Tootaghaj, Junguk Cho, Puneet Sharma
IPC: H04L12/24, G06N20/10, G05B6/02, H04L41/16, H04L41/5009, H04L41/5054, H04L41/147
Abstract: Example implementations relate to a proactive auto-scaling approach. According to an example, a target performance metric is received for an application running in a serverless framework of a private cloud. A machine-learning prediction model is trained to forecast future serverless workloads for the application during a window of time based on historical serverless workload information. The serverless framework is monitored to obtain serverless workload observations for the application. A future serverless workload for the application at a future time is predicted by the trained machine-learning prediction model based on the workload observations. A feedback control system is then used to output a new number of replicas based on a current value of the performance metric, the target performance metric, and the predicted future serverless workload. Finally, the serverless framework is caused to scale and pre-warm the number of replicas supporting the application to the new number.
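Hedged illustration: the feedback control step could combine a feed-forward term derived from the predicted workload with a proportional correction on the measured metric, as below. The gain, the per-replica throughput, and the choice of latency as the performance metric are assumptions for illustration, not the patent's controller.

```python
# A minimal sketch of a feed-forward + proportional controller that
# outputs a new replica count from the current metric, the target
# metric, and the predicted future workload.
def new_replica_count(current_replicas, current_latency_ms,
                      target_latency_ms, predicted_rps,
                      rps_per_replica=100, gain=0.5):
    # Feed-forward term: replicas needed for the forecast workload.
    feedforward = -(-predicted_rps // rps_per_replica)   # ceiling division
    # Feedback term: nudge the count up when latency overshoots the target.
    error = (current_latency_ms - target_latency_ms) / target_latency_ms
    feedback = round(gain * error * current_replicas)
    return max(1, max(feedforward, current_replicas + feedback))

# Latency 20% over target with a spike forecast: scale to 6 replicas.
print(new_replica_count(current_replicas=4, current_latency_ms=240,
                        target_latency_ms=200, predicted_rps=550))
```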
-
Publication No.: US20210184941A1
Publication Date: 2021-06-17
Application No.: US16714637
Application Date: 2019-12-13
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Diman Zad Tootaghaj, Junguk Cho, Puneet Sharma
Abstract: Example implementations relate to a proactive auto-scaling approach. According to an example, a target performance metric is received for an application running in a serverless framework of a private cloud. A machine-learning prediction model is trained to forecast future serverless workloads for the application during a window of time based on historical serverless workload information. The serverless framework is monitored to obtain serverless workload observations for the application. A future serverless workload for the application at a future time is predicted by the trained machine-learning prediction model based on the workload observations. A feedback control system is then used to output a new number of replicas based on a current value of the performance metric, the target performance metric, and the predicted future serverless workload. Finally, the serverless framework is caused to scale and pre-warm the number of replicas supporting the application to the new number.