Publication number: US20230342278A1
Publication date: 2023-10-26
Application number: US17725825
Filing date: 2022-04-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Anand PADMANABHA IYER, Swapnil Sunilkumar GANDHI
CPC classification number: G06F11/3442, G06F9/505, G06N5/043, G06N20/00
Abstract: The present disclosure relates to methods and systems for providing inferences using machine learning systems. The methods and systems receive a load forecast for processing requests by a machine learning model and split the machine learning model into a plurality of machine learning model portions based on the load forecast. The methods and systems determine a batch size for the requests for the machine learning model portions. The methods and systems use one or more available resources to execute the plurality of machine learning model portions to process the requests and generate inferences for the requests.
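The flow described in the abstract (forecast in, model split, batch size chosen, portions executed) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the disclosure: the function names, the idea of a model as a list of layer callables, the contiguous equal-size split, and the two heuristics (one portion per unit of forecast load a single worker cannot absorb, and a batch sized to the requests expected within a short window) are all hypothetical.

```python
import math
from typing import Callable, List, Sequence

Layer = Callable[[float], float]  # hypothetical stand-in for one model layer


def choose_num_portions(forecast_rps: float, worker_rps: float,
                        available_workers: int) -> int:
    """Assumed heuristic: one portion per worker needed to absorb the load."""
    needed = math.ceil(forecast_rps / worker_rps)
    return max(1, min(available_workers, needed))


def split_model(layers: Sequence[Layer], num_portions: int) -> List[List[Layer]]:
    """Split layers into contiguous portions of near-equal length."""
    n = len(layers)
    k = max(1, min(num_portions, n))
    base, extra = divmod(n, k)
    portions, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` portions get one more
        portions.append(list(layers[start:start + size]))
        start += size
    return portions


def choose_batch_size(forecast_rps: float, batch_window_s: float,
                      max_batch: int) -> int:
    """Assumed heuristic: requests expected within one batching window."""
    expected = math.floor(forecast_rps * batch_window_s)
    return max(1, min(max_batch, expected))


def run_inference(requests: Sequence[float], portions: List[List[Layer]],
                  batch_size: int) -> List[float]:
    """Run each batch through every portion in order (sequentially here;
    a real system would place portions on separate resources)."""
    results: List[float] = []
    for i in range(0, len(requests), batch_size):
        batch = list(requests[i:i + batch_size])
        for portion in portions:
            for layer in portion:
                batch = [layer(x) for x in batch]
        results.extend(batch)
    return results


# Toy four-layer "model" used only to exercise the sketch.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
k = choose_num_portions(forecast_rps=250, worker_rps=100, available_workers=8)
portions = split_model(layers, k)
batch = choose_batch_size(forecast_rps=8, batch_window_s=0.5, max_batch=16)
outputs = run_inference([1, 2, 3], portions, batch)
```

The sketch executes portions one after another in a single process. It shows only how a forecast could drive both the split and the batch size, not how the disclosed systems schedule portions across resources.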