-
Publication No.: US20230117033A1
Publication Date: 2023-04-20
Application No.: US18084948
Filing Date: 2022-12-20
Applicant: Samsung Electronics Co., Ltd.
Inventor: Junhaeng LEE , Seungwon LEE , Sangwon HA , Wonjo LEE
Abstract: A method of generating a fixed-point quantized neural network includes analyzing, from data of a pre-trained floating-point neural network, a per-channel statistical distribution of the floating-point parameter values of feature maps and a kernel; determining, for each channel, a fixed-point expression of each of the parameters that statistically covers the distribution range of the floating-point parameter values, based on the per-channel statistical distribution; determining fractional lengths of a bias and a weight for each channel, among the parameters of the per-channel fixed-point expression, based on a result of performing a convolution operation; and generating a fixed-point quantized neural network in which the bias and the weight for each channel have the determined fractional lengths.
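The per-channel scheme in the abstract above can be sketched in numpy. This is an illustrative sketch, not the patented implementation: the function names, the mean-plus-three-sigma coverage rule, and the 8-bit signed format are assumptions chosen for the example.

```python
import numpy as np

def choose_fraclen(values, total_bits=8, n_std=3.0):
    # Distribution range to cover (assumed rule): |mean| + n_std * std.
    bound = max(abs(values.mean()) + n_std * values.std(), 1e-8)
    # Integer bits (excluding the sign bit) needed to reach `bound`;
    # the remaining bits become the fractional length.
    int_bits = max(int(np.ceil(np.log2(bound))), 0)
    return total_bits - 1 - int_bits

def quantize_per_channel(kernel, total_bits=8):
    # kernel: (out_channels, ...) float array.
    # Returns int32 fixed-point values plus one fractional length per channel.
    q = np.empty(kernel.shape, dtype=np.int32)
    frac_lens = []
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    for c in range(kernel.shape[0]):
        fl = choose_fraclen(kernel[c], total_bits)
        q[c] = np.clip(np.round(kernel[c] * 2.0 ** fl), lo, hi)
        frac_lens.append(fl)
    return q, frac_lens
```

Dividing each channel by `2 ** frac_len` recovers an approximation of the original weights; the patent additionally derives the bias fractional length from a convolution result, which this sketch omits.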
-
Publication No.: US20240185029A1
Publication Date: 2024-06-06
Application No.: US18437370
Filing Date: 2024-02-09
Applicant: Samsung Electronics Co., Ltd.
Inventor: Wonjo LEE , Seungwon LEE , Junhaeng LEE
Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of the layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
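The weight-difference statistic described above can be illustrated with a small numpy sketch. The function name and the choice of mean absolute difference as the statistic are assumptions for illustration; the abstract does not fix a concrete statistic.

```python
import numpy as np

def select_lowbit_layers(initial_weights, cycle_weights, k=1):
    # initial_weights: {layer_name: ndarray} before training.
    # cycle_weights: list of {layer_name: ndarray}, one dict per learning cycle.
    # Statistic (assumed): mean absolute weight difference vs. the initial weights.
    stats = {}
    for name, w0 in initial_weights.items():
        diffs = [np.abs(cw[name] - w0).mean() for cw in cycle_weights]
        stats[name] = float(np.mean(diffs))
    # Layers whose weights moved least across cycles are the most stable,
    # so they are returned as candidates for lower-bit quantization.
    return sorted(stats, key=stats.get)[:k]
```

Quantizing only the returned layers with lower-bit precision, and leaving the rest at the original precision, would then yield the second network the abstract describes.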
-
Publication No.: US20210081798A1
Publication Date: 2021-03-18
Application No.: US16835532
Filing Date: 2020-03-31
Applicant: Samsung Electronics Co., Ltd.
Inventor: Minkyoung CHO , Wonjo LEE , Seungwon LEE
Abstract: A method and apparatus for pruning a neural network are provided. The method sets a weight threshold value based on the weight distribution of layers included in a neural network, predicts the change in inference accuracy of the neural network caused by pruning each layer based on the weight threshold value, determines a current subject layer to be pruned with the weight threshold value from among the layers included in the neural network, and prunes the determined current subject layer.
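As a rough illustration of the threshold-and-predict loop above, the following numpy sketch derives the threshold from the pooled weight distribution and uses the fraction of a layer's L1 weight mass that pruning would remove as a stand-in for the predicted accuracy change. The quantile rule, the L1 proxy, and the names are assumptions, not the patented method.

```python
import numpy as np

def weight_threshold(layers, sparsity=0.5):
    # Threshold from the pooled weight distribution: the magnitude
    # below which a `sparsity` fraction of all weights fall (assumed rule).
    all_w = np.concatenate([w.ravel() for w in layers.values()])
    return np.quantile(np.abs(all_w), sparsity)

def pick_layer_to_prune(layers, tau):
    # Proxy for the predicted accuracy change of pruning each layer:
    # the fraction of that layer's L1 mass removed by zeroing |w| < tau.
    cost = {name: np.abs(w[np.abs(w) < tau]).sum() / max(np.abs(w).sum(), 1e-12)
            for name, w in layers.items()}
    # The current subject layer is the one predicted to change accuracy least.
    best = min(cost, key=cost.get)
    pruned = layers[best].copy()
    pruned[np.abs(pruned) < tau] = 0.0
    return best, pruned
```

Repeating the selection over the remaining layers would prune the network layer by layer, re-evaluating the proxy each round.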
-
Publication No.: US20240346317A1
Publication Date: 2024-10-17
Application No.: US18752163
Filing Date: 2024-06-24
Applicant: Samsung Electronics Co., Ltd.
Inventor: Minkyoung CHO , Wonjo LEE , Seungwon LEE
Abstract: A method and apparatus for pruning a neural network are provided. The method sets a weight threshold value based on the weight distribution of layers included in a neural network, predicts the change in inference accuracy of the neural network caused by pruning each layer based on the weight threshold value, determines a current subject layer to be pruned with the weight threshold value from among the layers included in the neural network, and prunes the determined current subject layer.
-
Publication No.: US20230206031A1
Publication Date: 2023-06-29
Application No.: US18116553
Filing Date: 2023-03-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: Wonjo LEE , Seungwon LEE , Junhaeng LEE
Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of the layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
-
Publication No.: US20200218962A1
Publication Date: 2020-07-09
Application No.: US16738338
Filing Date: 2020-01-09
Applicant: Samsung Electronics Co., Ltd.
Inventor: Wonjo LEE , Seungwon LEE , Junhaeng LEE
Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of the layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
-
Publication No.: US20190130250A1
Publication Date: 2019-05-02
Application No.: US16168418
Filing Date: 2018-10-23
Applicant: Samsung Electronics Co., Ltd.
Inventor: Hyunsun PARK , Wonjo LEE , Sehwan LEE , Seungwon LEE
Abstract: A processor-implemented neural network method includes obtaining a plurality of kernels and an input feature map; determining a pruning index indicating a weight location where pruning is to be performed commonly within the plurality of kernels; and performing a Winograd-based convolution operation by pruning a weight corresponding to the determined pruning index with respect to each of the plurality of kernels.
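The common pruning index in the Winograd domain can be sketched as follows. The F(2x2, 3x3) kernel-transform matrix `G` is the standard Winograd transform; the magnitude-sum scoring rule and the function name are assumptions for illustration, not taken from the patent.

```python
import numpy as np

# Standard Winograd F(2x2, 3x3) kernel transform matrix: U = G @ g @ G.T
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])

def winograd_common_prune(kernels, num_prune=4):
    # kernels: (n, 3, 3). Transform each kernel to the 4x4 Winograd domain,
    # score each of the 16 positions by its summed |magnitude| across all
    # kernels, and zero the `num_prune` weakest positions in every kernel:
    # one pruning index applied commonly to the whole set.
    U = np.einsum('ij,njk,lk->nil', G, kernels, G)   # (n, 4, 4)
    score = np.abs(U).sum(axis=0)                    # (4, 4)
    idx = np.argsort(score.ravel())[:num_prune]      # weakest positions
    mask = np.ones(16, dtype=bool)
    mask[idx] = False
    return U * mask.reshape(1, 4, 4), idx
```

Because the zeroed positions coincide across all kernels, the element-wise multiplications at those positions can be skipped entirely during the Winograd convolution, which is what makes a common index (rather than per-kernel sparsity) attractive.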
-
Publication No.: US20190042948A1
Publication Date: 2019-02-07
Application No.: US16051788
Filing Date: 2018-08-01
Applicant: Samsung Electronics Co., Ltd.
Inventor: Junhaeng LEE , Seungwon LEE , Sangwon HA , Wonjo LEE
Abstract: A method of generating a fixed-point quantized neural network includes analyzing, from data of a pre-trained floating-point neural network, a per-channel statistical distribution of the floating-point parameter values of feature maps and a kernel; determining, for each channel, a fixed-point expression of each of the parameters that statistically covers the distribution range of the floating-point parameter values, based on the per-channel statistical distribution; determining fractional lengths of a bias and a weight for each channel, among the parameters of the per-channel fixed-point expression, based on a result of performing a convolution operation; and generating a fixed-point quantized neural network in which the bias and the weight for each channel have the determined fractional lengths.