Publication No.: US20250110784A1
Publication Date: 2025-04-03
Application No.: US18478185
Filing Date: 2023-09-29
Applicant: Amazon Technologies, Inc.
Inventor: Deepti Laxman RAGHA , Pratyush Kumar RANJAN , Michael PHAM , Maximiliano MACCANTI
IPC: G06F9/50
Abstract: Techniques for caching in a machine learning (ML) model hosting service are described. ML model usage data is aggregated from host usage data provided by each host of a first set of hosts, the ML model usage data including, for a particular ML model, a number of inference requests to the particular ML model. A priority order of hosts in a second set of hosts to service an inference request for the particular ML model is calculated. Based on the ML model usage data and the priority order, a set of ML models to load to a particular host in the second set of hosts is determined. The particular host is caused to load the set of ML models. A router is updated to direct ML model inference requests amongst the second set of hosts.
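
The sketch below illustrates the general shape of the technique the abstract describes: aggregating per-host usage counts, ranking hosts per model, and choosing which models each host should preload. The abstract does not disclose the actual ranking or placement functions, so this is a minimal sketch under assumptions: the rendezvous-style hash ranking, the per-host capacity limit, and all names (aggregate_usage, priority_order, models_to_load, cache_slots_per_host) are hypothetical, not taken from the patent.

```python
from collections import Counter
import hashlib

def aggregate_usage(host_usage_reports):
    """Sum per-model inference request counts across hosts in the first set.
    Each report is assumed to be a mapping {model_id: request_count}."""
    totals = Counter()
    for report in host_usage_reports:
        totals.update(report)  # Counter.update adds counts from a mapping
    return totals

def priority_order(model_id, hosts):
    """Rank hosts in the second set for a given model.
    A rendezvous (highest-random-weight) hash is one plausible choice;
    the patent abstract does not specify the ranking function."""
    return sorted(
        hosts,
        key=lambda h: hashlib.sha256(f"{model_id}:{h}".encode()).hexdigest(),
        reverse=True,
    )

def models_to_load(usage, hosts, cache_slots_per_host):
    """Assign the most-requested models to their highest-priority hosts,
    respecting an assumed per-host cache capacity. The resulting placement
    is what a router could consult to direct inference requests."""
    placement = {h: [] for h in hosts}
    for model_id, _count in usage.most_common():  # hottest models first
        for host in priority_order(model_id, hosts):
            if len(placement[host]) < cache_slots_per_host:
                placement[host].append(model_id)
                break
    return placement

if __name__ == "__main__":
    # Hypothetical usage reports from two hosts in the first set.
    reports = [{"m1": 120, "m2": 30}, {"m1": 80, "m3": 55}]
    usage = aggregate_usage(reports)
    print(models_to_load(usage, ["host-a", "host-b"], cache_slots_per_host=2))
```

A deterministic per-model host ranking like the one sketched here has the property that the same model consistently maps to the same preferred hosts, which keeps caches warm; whether the claimed invention uses hashing, load metrics, or another signal to compute the priority order is not stated in the abstract.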