-
11.
Publication No.: US20220382601A1
Publication Date: 2022-12-01
Application No.: US17334592
Filing Date: 2021-05-28
Applicant: salesforce.com, inc.
Inventor: Yuliya L. Feldman, Seyedshahin Ashrafzadeh, Alexandr Nikitin, Manoj Agarwal
Abstract: A machine learning serving infrastructure implementing a method of receiving or detecting an update of container metrics including resource usage and serviced requests per model or per container, processing the container metrics per model or per container to determine recent resource usage and serviced requests per model or per container, and rebalancing distribution of models to a plurality of containers to decrease a detected load imbalance between containers or a stressed container in the plurality of containers.
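Illustration (not part of the patent record): the rebalancing idea in this abstract can be sketched as a greedy loop that moves models from the most loaded container to the least loaded one. This is a minimal sketch under stated assumptions, not the claimed method. The names `Container`, `rebalance`, `tolerance`, and `max_moves` are hypothetical, and "load" is assumed to be a single number per model, such as recently serviced requests.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Container:
    """Hypothetical serving container with per-model recent load."""
    name: str
    # model name -> recent load (assumed: serviced requests in a recent window)
    model_load: Dict[str, float] = field(default_factory=dict)

    @property
    def load(self) -> float:
        return sum(self.model_load.values())


def rebalance(containers: List[Container],
              tolerance: float = 0.2,
              max_moves: int = 100) -> List[Tuple[str, str, str]]:
    """Greedily move models from the busiest to the idlest container.

    Stops when the gap between them is within `tolerance` times the mean
    load, or when no single move would shrink the gap.
    Returns the moves made as (model, source, destination) tuples.
    """
    moves: List[Tuple[str, str, str]] = []
    for _ in range(max_moves):
        busiest = max(containers, key=lambda c: c.load)
        idlest = min(containers, key=lambda c: c.load)
        gap = busiest.load - idlest.load
        mean = sum(c.load for c in containers) / len(containers)
        if mean == 0 or gap <= tolerance * mean:
            break
        # Only a model lighter than the gap reduces the imbalance when moved.
        candidates = {m: l for m, l in busiest.model_load.items() if 0 < l < gap}
        if not candidates:
            break
        # Moving exactly half the gap would equalize the pair, so pick the
        # model whose load is closest to that.
        model = min(candidates, key=lambda m: abs(candidates[m] - gap / 2))
        moved = busiest.model_load.pop(model)
        idlest.model_load[model] = idlest.model_load.get(model, 0.0) + moved
        moves.append((model, busiest.name, idlest.name))
    return moves
```

For example, with one container carrying models of load 50, 30, and 20 and a second container idle, a single move of the 50-load model equalizes the two. A real system would presumably also weigh memory and CPU usage per container, which this sketch ignores.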
-
Publication No.: US20220237505A1
Publication Date: 2022-07-28
Application No.: US17159639
Filing Date: 2021-01-27
Applicant: salesforce.com, inc.
Inventor: Yuliya L. Feldman, Seyedshahin Ashrafzadeh, Alexandr Nikitin, Manoj Agarwal
Abstract: Using container information to select containers for executing models is described. A system receives a request from an application and identifies a version of a machine-learning model associated with the request. The system identifies a set of each serving container corresponding to the machine-learning model from a cluster of available serving containers associated with the version of the machine-learning model. The system selects a serving container from the set of each serving container corresponding to the machine-learning model. If the machine-learning model is not loaded in the serving container, the system loads the machine-learning model in the serving container. If the machine-learning model is loaded in the serving container, the system executes, in the serving container, the machine-learning model on behalf of the request. The system responds to the request based on executing the machine-learning model on behalf of the request.
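Illustration (not part of the patent record): the request flow in this abstract can be sketched as a small router that looks up the containers for a model version, selects one, loads the model if it is missing, and then executes it. This is a minimal sketch under stated assumptions. The names `ServingContainer`, `Router`, `routing_table`, and `model_store` are hypothetical. The selection policy (prefer a container that already has the model loaded, otherwise take the first candidate) is also an assumption, since the abstract does not state one. The sketch loads and executes within the same request for simplicity.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Model = Callable[[Any], Any]


@dataclass
class ServingContainer:
    """Hypothetical container holding already-loaded models by version key."""
    name: str
    loaded: Dict[str, Model] = field(default_factory=dict)


class Router:
    def __init__(self,
                 routing_table: Dict[str, List[ServingContainer]],
                 model_store: Dict[str, Model]) -> None:
        # version key -> containers allowed to serve that model version
        self.routing_table = routing_table
        # version key -> model artifact (here simply a Python callable)
        self.model_store = model_store

    def handle(self, model_id: str, version: str, payload: Any) -> Any:
        key = f"{model_id}:{version}"
        candidates = self.routing_table.get(key, [])
        if not candidates:
            raise LookupError(f"no serving container for {key}")
        # Assumed policy: prefer a container where the model is already loaded.
        warm = [c for c in candidates if key in c.loaded]
        container = warm[0] if warm else candidates[0]
        if key not in container.loaded:
            # Lazy load on first use.
            container.loaded[key] = self.model_store[key]
        # Execute the model on behalf of the request and return its output.
        return container.loaded[key](payload)
```

A caller would build the router once and invoke `handle` per request. After the first request for a given version, later requests for it are routed to the container that already holds the model.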
-