Publication No.: US20230102510A1
Publication date: 2023-03-30
Application No.: US18057455
Filing date: 2022-11-21
Inventors: Hao HUANG, Zhenghua YANG, Long QIU, Ashish PINNINTI, Juan Diego FERRE, Amit Anand AMLESHWARAM
Abstract: Techniques for machine learning inferencing endpoint discovery in a distributed computing system are disclosed herein. In one example, a method includes searching a database containing machine learning endpoint records having data representing values of execution latency or prediction accuracy of corresponding inferencing endpoints deployed in the distributed computing system. The method also includes generating a list of inferencing endpoints matching the individual target values and determining whether a count of the inferencing endpoints in the generated list exceeds a preset threshold. In response to determining that the count does not exceed the preset threshold, the method includes instantiating one or more additional inferencing endpoints in the distributed computing system based on the individual target values in the received query.
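The flow described in the abstract (search the records, build a list of matches, compare the count against a threshold, instantiate more endpoints if needed) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `EndpointRecord` type, the `discover_endpoints` function, the `instantiate` callback, and the use of an in-memory list in place of the database. The sketch also assumes that "matching" means latency at or below the target and accuracy at or above it, and that enough endpoints are created to bring the count just above the threshold; the abstract only says "one or more".

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class EndpointRecord:
    """Hypothetical record for one deployed inferencing endpoint."""
    name: str
    latency_ms: float   # measured execution latency
    accuracy: float     # measured prediction accuracy, 0.0 to 1.0


def discover_endpoints(
    records: Iterable[EndpointRecord],
    target_latency_ms: float,
    target_accuracy: float,
    threshold: int,
    instantiate: Callable[[float, float], EndpointRecord],
) -> List[EndpointRecord]:
    """Return endpoints matching the targets, creating more if too few exist."""
    # Search step: keep records that satisfy both target values
    # (assumed interpretation of "matching").
    matches = [
        r for r in records
        if r.latency_ms <= target_latency_ms and r.accuracy >= target_accuracy
    ]

    # Threshold step: if the count does not exceed the preset threshold,
    # instantiate additional endpoints based on the query's target values.
    if len(matches) <= threshold:
        needed = threshold + 1 - len(matches)  # assumed sizing policy
        for _ in range(needed):
            matches.append(instantiate(target_latency_ms, target_accuracy))

    return matches
```

In a real system the `instantiate` callback would wrap whatever deployment mechanism the platform provides; here it is left as a parameter so the discovery logic stays independent of that choice.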