Dynamic kernel slicing for VGPU sharing in serverless computing systems

    Publication No.: US11113782B2

    Publication date: 2021-09-07

    Application No.: US16601831

    Application date: 2019-10-15

    Applicant: VMware, Inc.

    Abstract: Various examples are disclosed for dynamic kernel slicing for virtual graphics processing unit (vGPU) sharing in serverless computing systems. A computing device is configured to provide a serverless computing service, receive a request for execution of program code in the serverless computing service in which a plurality of virtual graphics processing units (vGPUs) are used in the execution of the program code, determine a slice size to partition a compute kernel of the program code into a plurality of sub-kernels for concurrent execution by the vGPUs, the slice size being determined for individual ones of the sub-kernels based on an optimization function that considers a load on a GPU, determine an execution schedule for executing the individual ones of the sub-kernels on the vGPUs in accordance with a scheduling policy, and execute the sub-kernels on the vGPUs as partitioned in accordance with the execution schedule.
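The slicing-and-scheduling flow in this abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `SubKernel`, `choose_slice_size`, `slice_kernel`, and `round_robin_schedule` are hypothetical, the linear load heuristic stands in for the optimization function (which the abstract does not specify), and round-robin stands in for the unspecified scheduling policy.

```python
from dataclasses import dataclass


@dataclass
class SubKernel:
    first_block: int  # index of the first thread block covered by this slice
    num_blocks: int   # number of thread blocks in this slice


def choose_slice_size(gpu_load, min_blocks=16, max_blocks=256):
    """Hypothetical stand-in for the optimization function.

    Assumption: a higher GPU load (0.0 to 1.0) yields a smaller slice.
    """
    if not 0.0 <= gpu_load <= 1.0:
        raise ValueError("gpu_load must be between 0.0 and 1.0")
    return max(min_blocks, int(max_blocks * (1.0 - gpu_load)))


def slice_kernel(total_blocks, load_probe):
    """Partition a kernel of total_blocks thread blocks into sub-kernels.

    load_probe is a callable returning the current GPU load; it is sampled
    once per sub-kernel, so each slice size is chosen individually.
    """
    sub_kernels = []
    start = 0
    while start < total_blocks:
        size = min(choose_slice_size(load_probe()), total_blocks - start)
        sub_kernels.append(SubKernel(start, size))
        start += size
    return sub_kernels


def round_robin_schedule(sub_kernels, num_vgpus):
    """Assumed scheduling policy: assign sub-kernels to vGPUs in rotation."""
    schedule = {vgpu: [] for vgpu in range(num_vgpus)}
    for index, sub_kernel in enumerate(sub_kernels):
        schedule[index % num_vgpus].append(sub_kernel)
    return schedule


if __name__ == "__main__":
    subs = slice_kernel(500, lambda: 0.5)  # constant 50% load for the demo
    for vgpu, work in round_robin_schedule(subs, 2).items():
        print(vgpu, work)
```

Sampling the load once per sub-kernel mirrors the abstract's statement that the slice size is determined for individual sub-kernels; a real system would replace the heuristic with its own optimization function.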

    DYNAMIC KERNEL SLICING FOR VGPU SHARING IN SERVERLESS COMPUTING SYSTEMS

    Publication No.: US20210110506A1

    Publication date: 2021-04-15

    Application No.: US16601831

    Application date: 2019-10-15

    Applicant: VMware, Inc.

    Abstract: Various examples are disclosed for dynamic kernel slicing for virtual graphics processing unit (vGPU) sharing in serverless computing systems. A computing device is configured to provide a serverless computing service, receive a request for execution of program code in the serverless computing service in which a plurality of virtual graphics processing units (vGPUs) are used in the execution of the program code, determine a slice size to partition a compute kernel of the program code into a plurality of sub-kernels for concurrent execution by the vGPUs, the slice size being determined for individual ones of the sub-kernels based on an optimization function that considers a load on a GPU, determine an execution schedule for executing the individual ones of the sub-kernels on the vGPUs in accordance with a scheduling policy, and execute the sub-kernels on the vGPUs as partitioned in accordance with the execution schedule.

    VIRTUAL PROCESSING UNIT SCHEDULING IN A COMPUTING SYSTEM

    Publication No.: US20250039093A1

    Publication date: 2025-01-30

    Application No.: US18380218

    Application date: 2023-10-16

    Applicant: VMWARE, INC.

    Abstract: An example computer system includes a hardware platform including a processing unit and software executing on the hardware platform. The software includes a workload and a scheduler, the workload including a network function chain having network functions, the scheduler configured to schedule the network functions for execution on the processing unit. A downstream network function includes a congestion monitor configured to monitor a first receive queue supplying packets to the downstream network function, the congestion monitor configured to compare occupancy of the first receive queue against a queue threshold. An upstream network function includes a rate controller configured to receive a notification from the congestion monitor generated in response to the occupancy of the first receive queue exceeding the queue threshold, the rate controller configured to modify a rate of packet flow between a second receive queue and the upstream network function in response to the notification.
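The feedback loop between the congestion monitor and the rate controller might look roughly like the Python sketch below. The class and function names, the halving backoff, and the per-tick packet budget are all assumptions for illustration; the abstract states only that the rate is modified in response to the notification, not how.

```python
from collections import deque


class RateController:
    """Upstream side: caps how many packets are pulled per scheduling tick."""

    def __init__(self, rate, min_rate=1, backoff=0.5):
        self.rate = rate          # packets the upstream function may process per tick
        self.min_rate = min_rate  # never throttle below this (assumed floor)
        self.backoff = backoff    # assumed multiplicative decrease factor

    def on_congestion(self):
        # Notification handler: reduce the rate, but keep it at or above min_rate.
        self.rate = max(self.min_rate, int(self.rate * self.backoff))


class CongestionMonitor:
    """Downstream side: watches the first receive queue against a threshold."""

    def __init__(self, first_rx, threshold, controller):
        self.first_rx = first_rx
        self.threshold = threshold
        self.controller = controller

    def check(self):
        # Notify the upstream rate controller only when occupancy exceeds the threshold.
        if len(self.first_rx) > self.threshold:
            self.controller.on_congestion()
            return True
        return False


def upstream_tick(second_rx, first_rx, controller):
    """Move up to controller.rate packets from the upstream function's own
    receive queue (second_rx) into the downstream queue (first_rx)."""
    moved = 0
    while second_rx and moved < controller.rate:
        first_rx.append(second_rx.popleft())
        moved += 1
    return moved


if __name__ == "__main__":
    second_rx, first_rx = deque(range(20)), deque()
    controller = RateController(rate=8)
    monitor = CongestionMonitor(first_rx, threshold=4, controller=controller)
    upstream_tick(second_rx, first_rx, controller)
    print(monitor.check(), controller.rate)
```

Throttling at the upstream function's own receive queue, rather than dropping packets downstream, keeps the backlog at the head of the chain, which matches the abstract's description of modifying the flow between the second receive queue and the upstream function.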

    MEMORY-AWARE PLACEMENT FOR VIRTUAL GPU ENABLED SYSTEMS

    Publication No.: US20220253341A1

    Publication date: 2022-08-11

    Application No.: US17733284

    Application date: 2022-04-29

    Applicant: VMware, Inc.

    Abstract: Disclosed are aspects of memory-aware placement in systems that include graphics processing units (GPUs) that are virtual GPU (vGPU) enabled. In some examples, graphics processing units (GPU) are identified in a computing environment. Graphics processing requests are received. A graphics processing request includes a GPU memory requirement. The graphics processing requests are processed using a graphics processing request placement model that minimizes a number of utilized GPUs that are utilized to accommodate the requests. Virtual GPUs (vGPUs) are created to accommodate the graphics processing requests according to the graphics processing request placement model. The utilized GPUs divide their GPU memories to provide a subset of the plurality of vGPUs.
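To make the goal of minimizing the number of utilized GPUs concrete, here is a small Python sketch that uses first-fit decreasing bin packing. This is a swapped-in heuristic chosen for brevity, not the placement model described in the patent; the function name `first_fit_decreasing` and the assumption that every GPU has the same memory capacity are hypothetical.

```python
def first_fit_decreasing(requests_gb, gpu_memory_gb):
    """Place GPU memory requests onto as few identical GPUs as the heuristic finds.

    requests_gb: memory requirement of each request, in GB (integers here).
    gpu_memory_gb: memory capacity of each GPU, in GB (assumed uniform).
    Returns a list of GPUs, each a list of the request sizes placed on it.
    """
    placements = []  # placements[i] holds the requests assigned to GPU i
    free = []        # free[i] is the remaining memory on GPU i

    # Placing the largest requests first tends to leave fewer unusable gaps.
    for request in sorted(requests_gb, reverse=True):
        if request > gpu_memory_gb:
            raise ValueError(f"request of {request} GB exceeds GPU capacity")
        for i, remaining in enumerate(free):
            if request <= remaining:
                placements[i].append(request)
                free[i] -= request
                break
        else:
            # No already-used GPU has room, so bring another GPU into use.
            placements.append([request])
            free.append(gpu_memory_gb - request)
    return placements


if __name__ == "__main__":
    print(first_fit_decreasing([8, 4, 4, 2, 6, 8], gpu_memory_gb=16))
```

Each inner list corresponds to one utilized GPU whose memory is divided among the vGPUs created for the requests placed on it. The heuristic does not guarantee an optimal count in general; an exact model would be needed for that.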

    Memory-aware placement for virtual GPU enabled systems

    Publication No.: US11263054B2

    Publication date: 2022-03-01

    Application No.: US16550327

    Application date: 2019-08-26

    Applicant: VMWARE, INC.

    Abstract: Disclosed are aspects of memory-aware placement in systems that include graphics processing units (GPUs) that are virtual GPU (vGPU) enabled. In some embodiments, a computing environment is monitored to identify graphics processing unit (GPU) data for a plurality of virtual GPU (vGPU) enabled GPUs of the computing environment, and a plurality of vGPU requests are received. A respective vGPU request includes a GPU memory requirement. GPU configurations are determined in order to accommodate vGPU requests. The GPU configurations are determined based on an integer linear programming (ILP) vGPU request placement model. Configured vGPU profiles are applied for vGPU enabled GPUs, and vGPUs are created based on the configured vGPU profiles. The vGPU requests are assigned to the vGPUs.
