Adaptive quantization for execution of machine learning models

Invention Grant

US11861467B2 Adaptive quantization for execution of machine learning models (in force)

Patent Title: Adaptive quantization for execution of machine learning models
Application No.: US16810123

Application Date: 2020-03-05
Publication No.: US11861467B2

Publication Date: 2024-01-02
Inventor: Serag Gadelrab, Karamvir Chatha, Ofer Rosenberg
Applicant: QUALCOMM Incorporated
Applicant Address: US CA San Diego
Assignee: QUALCOMM Incorporated
Current Assignee: QUALCOMM Incorporated
Current Assignee Address: US CA San Diego
Agency: Patterson + Sheridan LLP
Main IPC: G06N20/00
IPC: G06N20/00; G06F11/34; G06N5/04

Abstract:

Certain aspects of the present disclosure provide techniques for adaptively executing machine learning models on a computing device. An example method generally includes receiving weight information for a machine learning model to be executed on a computing device. The received weight information is reduced into quantized weight information having a reduced bit size relative to the received weight information. First inferences are performed using the machine learning model and the received weight information, and second inferences are performed using the machine learning model and the quantized weight information. Results of the first and second inferences are compared. When the results of the second inferences are determined to be within a threshold performance level of the results of the first inferences, one or more subsequent inferences are performed, based on that determination, using the machine learning model and the quantized weight information.
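
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy linear classifier, the symmetric uniform quantizer, the use of prediction agreement as the "performance level", and every name (quantize, predict, disagreement, choose_weights, max_disagreement) are assumptions made for illustration only.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization to signed `bits`-bit levels.

    Returns the dequantized values so the same toy model code can run
    with either weight set. (Assumed scheme, not taken from the patent.)
    """
    levels = 2 ** (bits - 1) - 1                   # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def predict(weights, x):
    """Toy model: binary linear classifier (hypothetical stand-in)."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0 else 0


def disagreement(weights_a, weights_b, inputs):
    """Fraction of inputs on which the two weight sets predict differently."""
    if not inputs:
        return 0.0
    differing = sum(
        predict(weights_a, x) != predict(weights_b, x) for x in inputs
    )
    return differing / len(inputs)


def choose_weights(weights, inputs, bits=8, max_disagreement=0.05):
    """Run first (full-precision) and second (quantized) inferences,
    compare them, and pick the weights to use for subsequent inferences."""
    quantized = quantize(weights, bits)
    if disagreement(weights, quantized, inputs) <= max_disagreement:
        return quantized, True    # quantized results are within the threshold
    return weights, False         # fall back to the received weights


received = [0.5, -0.25, 0.125]
samples = [[1, 0, 0], [0.1, 1, 0], [0, 0, 1], [-1, 0, 0]]

active, used_quantized = choose_weights(received, samples, bits=8)
subsequent = [predict(active, x) for x in samples]
```

In this sketch, 8-bit quantization leaves every prediction on the sample inputs unchanged, so the quantized weights are kept for later inferences. Calling choose_weights with bits=2 changes the prediction for one of the four samples, which exceeds the assumed 5% limit, so the received weights are kept instead.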

Public/Granted literature

US20210279635A1 ADAPTIVE QUANTIZATION FOR EXECUTION OF MACHINE LEARNING MODELS Public/Granted day: 2021-09-09

IPC classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N20/00	Machine learning