-
Publication Number: US11301762B1
Publication Date: 2022-04-12
Application Number: US16179217
Filing Date: 2018-11-02
Applicant: Amazon Technologies, Inc.
Inventor: Gang Chen , Long Gao , Eduardo Manuel Calleja
Abstract: Techniques for high-performance machine learning (ML) inference in heterogeneous edge devices are described. An ML model trained using any of a variety of different frameworks is translated into a common format that is runnable by the inference engines of edge devices. The translated model is optimized in hardware-agnostic and/or hardware-specific ways to improve inference performance, and the optimized model is sent to the edge devices. The inference engine of any edge device can be accessed by a customer application using the same defined API, regardless of the hardware characteristics of the edge device or the original format of the ML model.
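The core idea in this abstract can be sketched in a few lines: framework-specific models are translated into one neutral representation, and every device's engine exposes the same call. All names below (`CommonModel`, `translate`, `InferenceEngine.invoke`, the op-name mapping) are illustrative assumptions, not terms from the patent.

```python
# Hypothetical sketch: translate models from different frameworks into a
# common format, then run them through one inference engine API.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CommonModel:
    """A framework-neutral model: an ordered list of common op names."""
    ops: List[str] = field(default_factory=list)


def translate(framework: str, raw_layers: List[str]) -> CommonModel:
    """Translate a framework-specific layer list into the common format."""
    # Each translator rewrites framework-specific names into common op names.
    translators: Dict[str, Callable[[str], str]] = {
        "tf": lambda layer: layer.replace("tf.", "op."),
        "torch": lambda layer: layer.replace("nn.", "op."),
    }
    return CommonModel(ops=[translators[framework](l) for l in raw_layers])


class InferenceEngine:
    """Same defined API on every device; hardware differences stay internal."""

    def __init__(self, hardware: str):
        self.hardware = hardware  # e.g. "cpu", "gpu"; would steer optimization

    def optimize(self, model: CommonModel) -> CommonModel:
        # Hardware-agnostic pass: drop no-op layers (illustrative only).
        return CommonModel(ops=[op for op in model.ops if op != "op.Identity"])

    def invoke(self, model: CommonModel) -> List[str]:
        # The single call a customer application uses on any edge device.
        return self.optimize(model).ops
```

A model exported from either hypothetical framework yields the same common ops, so `invoke` behaves identically regardless of the model's original format.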
-
Publication Number: US10990850B1
Publication Date: 2021-04-27
Application Number: US16217400
Filing Date: 2018-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Gang Chen , William Shannon Fu , Long Gao
Abstract: Techniques for machine learning (ML) model knowledge distillation and automatic retraining are described. A model adaptation controller obtains samples generated by an edge device, together with the inference values that the edge device's deployed ML model produced from those samples. The model adaptation controller runs inference on the samples using a different ML model to generate inferences that can be used to determine whether the performance of the deployed ML model is lacking. If so, the model adaptation controller can retrain the deployed ML model using samples with ground-truth values generated by the different ML model, resulting in a lightweight retrained model that can be provisioned to the edge device. This retraining process may be performed iteratively to automatically improve and adapt the ML model running at the edge device.
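The retraining loop this abstract describes can be illustrated with a toy controller: a larger "teacher" model labels collected samples, agreement with the deployed "student" model is measured, and the student is refit to the teacher's labels when agreement drops. Every class, function, and threshold here (`teacher_predict`, `StudentModel`, `maybe_retrain`, the 0.95 cutoff) is an assumption for illustration, not from the patent.

```python
# Illustrative sketch of distillation-driven automatic retraining: compare a
# deployed lightweight model against a larger model on collected samples,
# and retrain on the larger model's labels if performance is lacking.
from typing import List


def teacher_predict(x: float) -> int:
    """Stand-in for the larger, more accurate ML model."""
    return 1 if x >= 0.5 else 0


class StudentModel:
    """Stand-in for the lightweight model deployed on the edge device."""

    def __init__(self, threshold: float):
        self.threshold = threshold

    def predict(self, x: float) -> int:
        return 1 if x >= self.threshold else 0


def agreement(student: StudentModel, samples: List[float]) -> float:
    """Fraction of samples where student and teacher inferences agree."""
    hits = sum(student.predict(x) == teacher_predict(x) for x in samples)
    return hits / len(samples)


def maybe_retrain(student: StudentModel, samples: List[float],
                  min_agreement: float = 0.95) -> StudentModel:
    """Retrain the student on teacher-generated 'ground truth' if it lags."""
    if agreement(student, samples) >= min_agreement:
        return student  # deployed model is still good enough
    # Toy "retraining": pick the candidate threshold that best matches the
    # teacher's labels on the collected samples.
    labels = [teacher_predict(x) for x in samples]
    best = max(
        (t / 100 for t in range(101)),
        key=lambda t: sum(
            (1 if x >= t else 0) == y for x, y in zip(samples, labels)
        ),
    )
    return StudentModel(best)
```

Run iteratively, each pass provisions a student whose decisions track the teacher more closely on the device's own recent data, which is the adaptation loop the abstract describes.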
-
Publication Number: US11704577B1
Publication Date: 2023-07-18
Application Number: US17716945
Filing Date: 2022-04-08
Applicant: Amazon Technologies, Inc.
Inventor: Gang Chen , Long Gao , Eduardo Manuel Calleja
CPC classification number: G06N5/027 , G06F16/116 , G06N20/00
Abstract: Techniques for high-performance machine learning (ML) inference in heterogeneous edge devices are described. An ML model trained using any of a variety of different frameworks is translated into a common format that is runnable by the inference engines of edge devices. The translated model is optimized in hardware-agnostic and/or hardware-specific ways to improve inference performance, and the optimized model is sent to the edge devices. The inference engine of any edge device can be accessed by a customer application using the same defined API, regardless of the hardware characteristics of the edge device or the original format of the ML model.