-
Publication Number: US11521039B2
Publication Date: 2022-12-06
Application Number: US16168418
Application Date: 2018-10-23
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Hyunsun Park , Wonjo Lee , Sehwan Lee , Seungwon Lee
IPC: G06N3/04 , G06N3/08 , G06F17/15 , G06N5/04 , G06F16/901
Abstract: A processor-implemented neural network method includes obtaining a plurality of kernels and an input feature map; determining a pruning index indicating a weight location where pruning is to be performed commonly within the plurality of kernels; and performing a Winograd-based convolution operation by pruning a weight corresponding to the determined pruning index with respect to each of the plurality of kernels.
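A minimal NumPy sketch of the shared pruning-index idea in this abstract, for a Winograd F(2x2, 3x3) convolution over one input tile. The saliency criterion (summed absolute Winograd-domain weight across kernels) and the prune_ratio parameter are assumptions for illustration; the abstract only requires a pruning index applied commonly to all kernels.

```python
import numpy as np

# Winograd F(2x2, 3x3) transform matrices.
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
B = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, -1.0, 1.0],
              [-1.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, -1.0]])
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, -1.0],
              [0.0, -1.0]])

def common_pruning_index(winograd_kernels, prune_ratio):
    """Pick Winograd-domain locations to prune commonly across all kernels,
    ranking the 16 locations by summed absolute weight (assumed saliency)."""
    saliency = np.abs(winograd_kernels).sum(axis=0)     # (4, 4)
    order = np.argsort(saliency, axis=None)             # ascending flat order
    num_prune = int(prune_ratio * saliency.size)
    return np.unravel_index(order[:num_prune], saliency.shape)

def pruned_winograd_convolution(kernels, input_tile, prune_ratio=0.25):
    """Convolve one 4x4 input tile with K 3x3 kernels in the Winograd domain,
    zeroing the same pruning index in every transformed kernel."""
    U = np.stack([G @ g @ G.T for g in kernels])        # (K, 4, 4) kernel transforms
    idx = common_pruning_index(U, prune_ratio)
    U[(slice(None),) + idx] = 0.0                       # prune commonly across kernels
    V = B.T @ input_tile @ B                            # input transform
    return np.stack([A.T @ (u * V) @ A for u in U])     # (K, 2, 2) output tiles
```

For a full feature map, the same pruned kernel transforms would be reused across every 4x4 input tile.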
-
Publication Number: US12045723B2
Publication Date: 2024-07-23
Application Number: US16835532
Application Date: 2020-03-31
Applicant: Samsung Electronics Co., Ltd.
Inventor: Minkyoung Cho , Wonjo Lee , Seungwon Lee
Abstract: A method and apparatus for pruning a neural network are provided. The method sets a weight threshold value based on a weight distribution of the layers included in a neural network, predicts the change in inference accuracy caused by pruning each layer with the weight threshold value, determines, from among the layers included in the neural network, a current subject layer to be pruned with the weight threshold value, and prunes the determined current subject layer.
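A minimal sketch of the layer-wise flow in this abstract, assuming NumPy weight arrays keyed by layer name and an evaluate_accuracy callback that runs held-out inference. The percentile-based threshold and the "smallest predicted accuracy drop" selection rule are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def weight_threshold(layers, percentile=60):
    """Assumed threshold rule: a percentile of the absolute weights pooled
    over all layers (the patent derives it from the weight distribution)."""
    all_weights = np.concatenate([w.ravel() for w in layers.values()])
    return np.percentile(np.abs(all_weights), percentile)

def select_and_prune_layer(layers, evaluate_accuracy, baseline_accuracy, threshold):
    """Predict the accuracy change from pruning each layer at `threshold`,
    pick the least sensitive layer as the current subject layer, and prune it.

    `evaluate_accuracy(layers)` is an assumed callback returning accuracy in [0, 1].
    """
    predicted_drop = {}
    for name, w in layers.items():
        trial = dict(layers)
        trial[name] = np.where(np.abs(w) < threshold, 0.0, w)   # tentative pruning
        predicted_drop[name] = baseline_accuracy - evaluate_accuracy(trial)

    subject = min(predicted_drop, key=predicted_drop.get)        # smallest predicted drop
    layers[subject] = np.where(np.abs(layers[subject]) < threshold,
                               0.0, layers[subject])             # prune the subject layer
    return subject, predicted_drop[subject]
```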
-
Publication Number: US11625577B2
Publication Date: 2023-04-11
Application Number: US16738338
Application Date: 2020-01-09
Applicant: Samsung Electronics Co., Ltd.
Inventor: Wonjo Lee , Seungwon Lee , Junhaeng Lee
Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of the layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
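A minimal sketch of the layer-selection step in this abstract, assuming per-layer weight snapshots collected after each training cycle. The mean absolute weight difference as the statistic, and selecting the layers whose weights moved least as lower-bit candidates, are assumptions for illustration.

```python
import numpy as np

def weight_difference_statistics(initial_weights, weight_history):
    """Per-layer statistic of |W_cycle - W_initial| over training cycles.

    `weight_history` maps layer name -> list of weight arrays, one per cycle.
    The mean absolute difference is an assumed choice of statistic.
    """
    stats = {}
    for name, snapshots in weight_history.items():
        diffs = [np.abs(w - initial_weights[name]).mean() for w in snapshots]
        stats[name] = float(np.mean(diffs))
    return stats

def select_layers_for_lower_bit(stats, num_layers):
    """Pick the layers with the smallest weight-difference statistic as
    candidates for lower-bit quantization (an assumed interpretation)."""
    ranked = sorted(stats, key=stats.get)
    return ranked[:num_layers]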
-
Publication Number: US11588496B2
Publication Date: 2023-02-21
Application Number: US16051788
Application Date: 2018-08-01
Applicant: Samsung Electronics Co., Ltd.
Inventor: Junhaeng Lee , Seungwon Lee , Sangwon Ha , Wonjo Lee
Abstract: A method of generating a fixed-point quantized neural network includes analyzing, from data of a pre-trained floating-point neural network, a statistical distribution for each channel of the floating-point parameter values of the feature maps and a kernel, determining a fixed-point expression of the parameters for each channel that statistically covers the distribution range of the floating-point parameter values based on the statistical distribution for each channel, determining fractional lengths of a bias and a weight for each channel, among the parameters of the fixed-point expression, based on a result of performing a convolution operation, and generating a fixed-point quantized neural network in which the bias and the weight for each channel have the determined fractional lengths.
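A minimal sketch of the per-channel fixed-point step in this abstract, assuming a kernel laid out as an (out_channels, ...) NumPy array. Deriving the fractional length purely from the per-channel maximum absolute value is a simplifying assumption; the patent also uses the result of a convolution operation to fix the fractional lengths of bias and weight.

```python
import numpy as np

def fractional_length(values, bit_width=8):
    """Fractional length that lets a signed `bit_width`-bit fixed-point format
    cover the observed per-channel range (a common heuristic, assumed here)."""
    max_abs = np.max(np.abs(values)) + 1e-12
    integer_bits = int(np.ceil(np.log2(max_abs)))
    return (bit_width - 1) - integer_bits          # one bit reserved for sign

def quantize_per_channel(kernel, bit_width=8):
    """Quantize a kernel of shape (out_channels, ...) channel by channel,
    returning integer codes and the per-channel fractional lengths."""
    q_kernel = np.empty_like(kernel, dtype=np.int32)
    frac_lengths = []
    qmin, qmax = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    for c in range(kernel.shape[0]):
        fl = fractional_length(kernel[c], bit_width)
        scaled = np.round(kernel[c] * (2.0 ** fl))
        q_kernel[c] = np.clip(scaled, qmin, qmax).astype(np.int32)
        frac_lengths.append(fl)
    return q_kernel, frac_lengths
```

A channel's real value is recovered as q / 2**fl, so channels with a narrow value range get more fractional bits and finer resolution.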
-
Publication Number: US11934939B2
Publication Date: 2024-03-19
Application Number: US18116553
Application Date: 2023-03-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: Wonjo Lee , Seungwon Lee , Junhaeng Lee
Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of the layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
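This entry shares its abstract with US11625577B2 above; as a complement to the layer-selection sketch after that entry, here is a minimal sketch of the final step, generating the second network by quantizing the selected layers with lower-bit precision. The symmetric uniform scheme and the 4-bit/8-bit split are assumptions for illustration.

```python
import numpy as np

def quantize_symmetric(weights, bit_width):
    """Symmetric uniform quantization of one layer's weights to `bit_width`
    bits (an assumed scheme; the patent only specifies lower-bit precision)."""
    qmax = 2 ** (bit_width - 1) - 1
    scale = np.max(np.abs(weights)) / qmax + 1e-12
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax)
    return q * scale                      # dequantized values emulate low-bit inference

def build_second_network(weights, lower_bit_layers, lower_bits=4, default_bits=8):
    """Quantize the layers selected for lower-bit precision and keep the
    remaining layers at the default precision."""
    return {name: quantize_symmetric(w, lower_bits if name in lower_bit_layers
                                     else default_bits)
            for name, w in weights.items()}
```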