Invention Grant
- Patent Title: Adaptive quantization for execution of machine learning models
- Application No.: US16810123
- Application Date: 2020-03-05
- Publication No.: US11861467B2
- Publication Date: 2024-01-02
- Inventor: Serag Gadelrab, Karamvir Chatha, Ofer Rosenberg
- Applicant: QUALCOMM Incorporated
- Applicant Address: US CA San Diego
- Assignee: QUALCOMM Incorporated
- Current Assignee: QUALCOMM Incorporated
- Current Assignee Address: US CA San Diego
- Agency: Patterson + Sheridan LLP
- Main IPC: G06N20/00
- IPC: G06N20/00; G06F11/34; G06N5/04

Abstract:
Certain aspects of the present disclosure provide techniques for adaptively executing machine learning models on a computing device. An example method generally includes receiving weight information for a machine learning model to be executed on a computing device. The received weight information is reduced into quantized weight information having a reduced bit size relative to the received weight information. First inferences are performed using the machine learning model and the received weight information, and second inferences are performed using the machine learning model and the quantized weight information. Results of the first and second inferences are compared, and it is determined that results of the second inferences are within a threshold performance level of results of the first inferences. Based on the determination, one or more subsequent inferences are performed using the machine learning model and the quantized weight information.
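
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy model (a single dot product), the symmetric uniform quantizer, the use of maximum absolute output difference as the "threshold performance level", and all function names (`quantize`, `dequantize`, `infer`, `choose_weights`) are assumptions made for illustration only.

```python
def quantize(weights, bits=8):
    """Symmetric uniform quantization of float weights to signed integers (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Map quantized integers back to approximate float weights."""
    return [q * scale for q in q_weights]


def infer(weights, x):
    """Hypothetical toy model: one dot product between weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))


def choose_weights(weights, calibration_inputs, bits=8, threshold=0.01):
    """Run first (full-precision) and second (quantized) inferences and compare them.

    Returns (weights_to_use, use_quantized, max_err). The quantized weights are
    selected for subsequent inferences only if the largest output difference
    stays within `threshold` (an assumed stand-in for the performance metric).
    """
    q_weights, scale = quantize(weights, bits)
    approx = dequantize(q_weights, scale)
    first = [infer(weights, x) for x in calibration_inputs]
    second = [infer(approx, x) for x in calibration_inputs]
    max_err = max(abs(a - b) for a, b in zip(first, second))
    use_quantized = max_err <= threshold
    return (approx if use_quantized else weights), use_quantized, max_err


if __name__ == "__main__":
    w = [0.5, -1.0, 0.25]
    xs = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
    for bits in (8, 2):
        _, ok, err = choose_weights(w, xs, bits=bits, threshold=0.05)
        print(f"{bits}-bit: use quantized = {ok}, max error = {err:.4f}")
```

In this sketch, a bit width that keeps the outputs within the threshold is accepted for later inferences, while a bit width that degrades the outputs too far causes a fallback to the received weights. A real system would likely compare task-level metrics such as accuracy rather than raw output differences.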
Public/Granted literature
- US20210279635A1 ADAPTIVE QUANTIZATION FOR EXECUTION OF MACHINE LEARNING MODELS, Public/Granted day: 2021-09-09