-
Publication No.: US11113782B2
Publication Date: 2021-09-07
Application No.: US16601831
Filing Date: 2019-10-15
Applicant: VMware, Inc.
Inventor: Chandra Prakash, Anshuj Garg, Uday Pundalik Kurkure, Hari Sivaraman, Lan Vu, Sairam Veeraswamy
Abstract: Various examples are disclosed for dynamic kernel slicing for virtual graphics processing unit (vGPU) sharing in serverless computing systems. A computing device is configured to provide a serverless computing service, receive a request for execution of program code in the serverless computing service in which a plurality of virtual graphics processing units (vGPUs) are used in the execution of the program code, determine a slice size to partition a compute kernel of the program code into a plurality of sub-kernels for concurrent execution by the vGPUs, the slice size being determined for individual ones of the sub-kernels based on an optimization function that considers a load on a GPU, determine an execution schedule for executing the individual ones of the sub-kernels on the vGPUs in accordance with a scheduling policy, and execute the sub-kernels on the vGPUs as partitioned in accordance with the execution schedule.
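The flow described in the abstract (choose a slice size, partition the compute kernel into sub-kernels, then schedule them across vGPUs) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: `SubKernel`, `choose_slice_size`, `slice_kernel`, and `round_robin_schedule` are hypothetical names, the linear "shrink with load" rule is only a stand-in for the optimization function, round-robin is only one possible scheduling policy, and the GPU load is held constant for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubKernel:
    first_block: int  # index of the first thread block covered by this slice
    num_blocks: int   # number of thread blocks in this slice


def choose_slice_size(remaining_blocks: int, gpu_load: float, max_slice: int) -> int:
    """Hypothetical stand-in for the optimization function.

    Assumption: the slice shrinks linearly as GPU load (0.0 to 1.0) rises,
    but never drops below one block or exceeds the remaining work.
    """
    load = min(max(gpu_load, 0.0), 1.0)
    size = max(1, int(max_slice * (1.0 - load)))
    return min(size, remaining_blocks)


def slice_kernel(total_blocks: int, gpu_load: float, max_slice: int) -> list[SubKernel]:
    """Partition a kernel of `total_blocks` thread blocks into sub-kernels."""
    sub_kernels = []
    start = 0
    while start < total_blocks:
        # The slice size is chosen separately for each sub-kernel.
        size = choose_slice_size(total_blocks - start, gpu_load, max_slice)
        sub_kernels.append(SubKernel(start, size))
        start += size
    return sub_kernels


def round_robin_schedule(sub_kernels: list[SubKernel], num_vgpus: int) -> dict[int, list[SubKernel]]:
    """Assumed scheduling policy: assign sub-kernels to vGPUs in turn."""
    schedule = {vgpu: [] for vgpu in range(num_vgpus)}
    for i, sub_kernel in enumerate(sub_kernels):
        schedule[i % num_vgpus].append(sub_kernel)
    return schedule


if __name__ == "__main__":
    slices = slice_kernel(total_blocks=10, gpu_load=0.5, max_slice=8)
    for vgpu, work in round_robin_schedule(slices, num_vgpus=2).items():
        print(vgpu, work)
```

In a real system the load would presumably be sampled again before each slice is cut, so later sub-kernels could differ in size from earlier ones; the sketch keeps it fixed only to stay short.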
-
Publication No.: US20210110506A1
Publication Date: 2021-04-15
Application No.: US16601831
Filing Date: 2019-10-15
Applicant: VMware, Inc.
Inventor: Chandra Prakash, Anshuj Garg, Uday Pundalik Kurkure, Hari Sivaraman, Lan Vu, Sairam Veeraswamy
Abstract: Various examples are disclosed for dynamic kernel slicing for virtual graphics processing unit (vGPU) sharing in serverless computing systems. A computing device is configured to provide a serverless computing service, receive a request for execution of program code in the serverless computing service in which a plurality of virtual graphics processing units (vGPUs) are used in the execution of the program code, determine a slice size to partition a compute kernel of the program code into a plurality of sub-kernels for concurrent execution by the vGPUs, the slice size being determined for individual ones of the sub-kernels based on an optimization function that considers a load on a GPU, determine an execution schedule for executing the individual ones of the sub-kernels on the vGPUs in accordance with a scheduling policy, and execute the sub-kernels on the vGPUs as partitioned in accordance with the execution schedule.
-