Neural network method and apparatus

    Publication number: US12045723B2

    Publication date: 2024-07-23

    Application number: US16835532

    Filing date: 2020-03-31

    CPC classification number: G06N3/082 G06N3/04

    Abstract: A method and apparatus for pruning a neural network are provided. The method sets a weight threshold value based on a weight distribution of layers included in the neural network, predicts the change in inference accuracy of the neural network that pruning each layer based on the weight threshold value would cause, determines a current subject layer to be pruned with the weight threshold value from among the layers included in the neural network, and prunes the determined current subject layer.
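
The four steps in this abstract (set a threshold, predict the accuracy change per layer, pick a subject layer, prune it) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented procedure: the abstract does not say how the threshold or the accuracy prediction is computed, so the sketch takes the threshold as a quantile of all weight magnitudes and uses the share of a layer's L1 magnitude that would be removed as a stand-in for the predicted accuracy drop. All function names and the `target_sparsity` parameter are hypothetical.

```python
def weight_threshold(layers, target_sparsity):
    """Assumption: threshold = magnitude at the target_sparsity quantile of all weights."""
    magnitudes = sorted(abs(w) for ws in layers.values() for w in ws)
    index = min(int(target_sparsity * len(magnitudes)), len(magnitudes) - 1)
    return magnitudes[index]


def predicted_accuracy_drop(weights, threshold):
    """Hypothetical proxy: fraction of the layer's L1 magnitude that pruning removes."""
    total = sum(abs(w) for w in weights)
    removed = sum(abs(w) for w in weights if abs(w) < threshold)
    return removed / total if total else 0.0


def prune_layer(weights, threshold):
    """Zero every weight whose magnitude is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def prune_step(layers, target_sparsity, already_pruned=()):
    """Pick the not-yet-pruned layer with the smallest predicted drop and prune it."""
    threshold = weight_threshold(layers, target_sparsity)
    candidates = {
        name: predicted_accuracy_drop(ws, threshold)
        for name, ws in layers.items()
        if name not in already_pruned
    }
    subject = min(candidates, key=candidates.get)
    pruned = dict(layers)
    pruned[subject] = prune_layer(layers[subject], threshold)
    return subject, threshold, pruned


# Toy network: layer name -> flat list of weights (made-up values).
layers = {
    "conv1": [0.9, -0.8, 0.05, 0.7],
    "fc1": [0.02, -0.01, 0.6, 0.03],
}
subject, threshold, pruned = prune_step(layers, target_sparsity=0.5)
```

In a real setting the proxy would likely be replaced by a measured or modeled accuracy change, and `prune_step` would be called repeatedly, adding each chosen layer to `already_pruned`.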

    Method and apparatus for neural network quantization

    Publication number: US11625577B2

    Publication date: 2023-04-11

    Application number: US16738338

    Filing date: 2020-01-09

    Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of the layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
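
The layer-selection idea in this abstract can be illustrated with a short Python sketch. Several choices here are assumptions, because the abstract leaves them open: the statistic is taken to be the mean absolute difference between the initial weights and the weights after each learning cycle, the layers with the smallest statistic are the ones chosen for lower-bit precision, and the quantizer is a plain symmetric uniform one. The function names and the parameters `k` and `low_bits` are hypothetical.

```python
def mean_abs_update(initial, history):
    """Mean |updated - initial| over all weights and all recorded cycles."""
    diffs = [
        abs(w - w0)
        for updated in history
        for w, w0 in zip(updated, initial)
    ]
    return sum(diffs) / len(diffs)


def pick_low_bit_layers(initial, history, k):
    """Assumption: the k layers whose weights moved least are quantized harder."""
    stats = {
        name: mean_abs_update(initial[name], history[name])
        for name in initial
    }
    return sorted(stats, key=stats.get)[:k]


def quantize(weights, bits):
    """Symmetric uniform quantization, returned as dequantized floats."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def build_second_network(weights, low_bit_layers, low_bits):
    """Quantize only the selected layers; leave the others untouched."""
    return {
        name: quantize(ws, low_bits) if name in low_bit_layers else list(ws)
        for name, ws in weights.items()
    }


# Toy data: initial weights and one recorded learning cycle per layer.
initial = {"a": [1.0, -1.0], "b": [0.5, 0.5]}
history = {"a": [[1.5, -1.5]], "b": [[0.5, 0.75]]}

selected = pick_low_bit_layers(initial, history, k=1)
final_weights = {name: cycles[-1] for name, cycles in history.items()}
second_network = build_second_network(final_weights, selected, low_bits=2)
```

Whether a small or a large statistic should mark a layer for lower precision is a design decision the abstract does not settle; swapping the sort order in `pick_low_bit_layers` gives the opposite policy.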

    Method and apparatus for generating fixed-point quantized neural network

    Publication number: US11588496B2

    Publication date: 2023-02-21

    Application number: US16051788

    Filing date: 2018-08-01

    Abstract: A method of generating a fixed-point quantized neural network includes analyzing a statistical distribution for each channel of floating-point parameter values of feature maps and a kernel for each channel from data of a pre-trained floating-point neural network, determining a fixed-point expression of each of the parameters for each channel statistically covering a distribution range of the floating-point parameter values based on the statistical distribution for each channel, determining fractional lengths of a bias and a weight for each channel among the parameters of the fixed-point expression for each channel based on a result of performing a convolution operation, and generating a fixed-point quantized neural network in which the bias and the weight for each channel have the determined fractional lengths.
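
To make the per-channel idea concrete, the following Python sketch picks a separate fractional length for each channel and converts that channel's values to signed fixed-point integers. It is a simplified illustration, not the patented method: it assumes the range to cover is the channel's maximum absolute value (the abstract speaks of a statistical distribution, which could equally mean a fitted range), it assumes an 8-bit signed format, and it does not model the step that derives bias and weight fractional lengths from the convolution result. The function names are hypothetical.

```python
import math


def fractional_length(values, bits):
    """Largest fractional length (capped at bits - 1) at which max |v| still fits."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bits - 1
    # Bits needed for the integer part, excluding the sign bit.
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    return bits - 1 - int_bits


def to_fixed(values, bits, frac):
    """Scale by 2**frac, round, and saturate to the signed range."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    scale = 2.0 ** frac
    return [min(hi, max(lo, round(v * scale))) for v in values]


def quantize_per_channel(channels, bits=8):
    """Return (fractional_length, fixed_point_values) for every channel."""
    result = []
    for values in channels:
        frac = fractional_length(values, bits)
        result.append((frac, to_fixed(values, bits, frac)))
    return result


# Two channels with different ranges get different fractional lengths.
channels = [[0.5, -0.25], [3.0, -1.5]]
quantized = quantize_per_channel(channels)
```

The small-valued channel keeps more fractional bits than the large-valued one, which is the benefit of choosing the format per channel rather than per layer. As a general fixed-point fact, multiplying values with fractional lengths a and b yields a product with fractional length a + b, so a bias added to a convolution accumulator has to be aligned to that sum.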

    Method and apparatus for neural network quantization

    Publication number: US11934939B2

    Publication date: 2024-03-19

    Application number: US18116553

    Filing date: 2023-03-02

    CPC classification number: G06N3/045 G06N3/047 G06N3/084

    Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of the layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
