Invention Grant
- Patent Title: Jointly pruning and quantizing deep neural networks
- Application No.: US16396619
- Application Date: 2019-04-26
- Publication No.: US11475308B2
- Publication Date: 2022-10-18
- Inventor: Georgios Georgiadis, Weiran Deng
- Applicant: Samsung Electronics Co., Ltd.
- Applicant Address: KR Suwon-si
- Assignee: Samsung Electronics Co., Ltd.
- Current Assignee: Samsung Electronics Co., Ltd.
- Current Assignee Address: KR Suwon-si
- Agency: Renaissance IP Law Group LLP
- Main IPC: G06N3/08
- IPC: G06N3/08; G06N3/04

Abstract:
A system and a method generate a neural network that includes at least one layer having weights and output feature maps that have been jointly pruned and quantized. The weights of the layer are pruned using an analytic threshold function. Each weight remaining after pruning is quantized based on a weighted average of a quantization and dequantization of the weight for all quantization levels to form quantized weights for the layer. Output feature maps of the layer are generated based on the quantized weights of the layer. Each output feature map of the layer is quantized based on a weighted average of a quantization and dequantization of the output feature map for all quantization levels. Parameters of the analytic threshold function, the weighted average of all quantization levels of the weights and the weighted average of each output feature map of the layer are updated using a cost function.
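The forward pass described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the abstract does not give the form of the analytic threshold function or how the averaging weights are computed, so the tanh-based gate in `soft_prune`, the softmax over `logits`, the reading of "quantization levels" as candidate bit widths `(2, 4, 8)`, and all function and parameter names are assumptions introduced here. The cost-function update of `alpha`, `beta`, and the logits is omitted.

```python
import math

def soft_prune(w, alpha, beta):
    # Assumed smooth threshold (not taken from the patent): the gate is
    # close to 0 when |w| << alpha and close to 1 when |w| >> alpha.
    gate = 0.5 * (1.0 + math.tanh(0.5 * beta * (w * w - alpha * alpha)))
    return w * gate

def quant_dequant(x, bits, x_max):
    # Symmetric uniform quantize-then-dequantize; requires bits >= 2.
    # Zero is always representable, so pruned weights stay at zero.
    n = 2 ** (bits - 1) - 1            # number of positive levels
    step = x_max / n
    clipped = max(-x_max, min(x_max, x))
    return round(clipped / step) * step

def softmax(logits):
    m = max(logits)                    # subtract the max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_quantize(x, logits, x_max, bit_widths=(2, 4, 8)):
    # Weighted average of the quantized-dequantized value over all
    # candidate bit widths; the averaging weights are softmax(logits).
    probs = softmax(logits)
    return sum(p * quant_dequant(x, b, x_max)
               for p, b in zip(probs, bit_widths))

def layer_forward(weights, inputs, alpha, beta,
                  w_logits, a_logits, w_max, a_max):
    # Prune, then quantize the weights, compute each output from the
    # quantized weights, then quantize the output itself.
    outputs = []
    for row in weights:
        q_row = [soft_quantize(soft_prune(w, alpha, beta), w_logits, w_max)
                 for w in row]
        pre = sum(qw * x for qw, x in zip(q_row, inputs))
        outputs.append(soft_quantize(pre, a_logits, a_max))
    return outputs
```

In this sketch the trainable quantities would be `alpha`, `beta`, `w_logits`, and `a_logits`. As one logit grows much larger than the others, the weighted average collapses onto a single bit width, which is one plausible way a training procedure could select a precision per layer.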
Public/Granted literature
- US20200293893A1 JOINTLY PRUNING AND QUANTIZING DEEP NEURAL NETWORKS, Public/Granted day: 2020-09-17