METHOD AND APPARATUS FOR GENERATING FIXED-POINT QUANTIZED NEURAL NETWORK

    Publication Number: US20230117033A1

    Publication Date: 2023-04-20

    Application Number: US18084948

    Application Date: 2022-12-20

    Abstract: A method of generating a fixed-point quantized neural network includes analyzing, for each channel, a statistical distribution of the floating-point parameter values of feature maps and a kernel from data of a pre-trained floating-point neural network; determining, based on the per-channel statistical distribution, a fixed-point expression of each parameter that statistically covers the distribution range of the floating-point parameter values; determining, based on a result of performing a convolution operation, fractional lengths of the bias and the weight for each channel among the parameters of the per-channel fixed-point expression; and generating a fixed-point quantized neural network in which the bias and the weight of each channel have the determined fractional lengths.
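
    The per-channel step of the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the helper names (`fractional_length`, `quantize_per_channel`), the 8-bit word size, and the use of a 99.9% quantile as the "statistically covering" range are all assumptions for the example.

    ```python
    import numpy as np

    def fractional_length(values, bit_width=8, coverage=0.999):
        """Pick a fractional length whose range statistically covers the values.

        Covering a high quantile (assumption: 99.9%) of the magnitude
        distribution, rather than the absolute max, is robust to outliers.
        """
        max_abs = np.quantile(np.abs(values), coverage)
        # Integer bits needed to represent max_abs; one bit is kept for sign.
        int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))))
        return bit_width - 1 - int_bits

    def quantize_per_channel(weights, bit_width=8):
        """Quantize a [channels, ...] float tensor channel by channel."""
        q = np.empty_like(weights, dtype=np.int32)
        fls = []
        qmax = 2 ** (bit_width - 1) - 1
        for c in range(weights.shape[0]):
            fl = fractional_length(weights[c].ravel(), bit_width)
            scale = 2.0 ** fl  # fixed-point step is 2^-fl
            q[c] = np.clip(np.round(weights[c] * scale), -qmax - 1, qmax)
            fls.append(fl)
        return q, fls
    ```

    Each channel gets its own fractional length, so a channel with small-magnitude weights keeps more fractional precision than one with large-magnitude weights.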

    METHOD AND APPARATUS FOR NEURAL NETWORK QUANTIZATION

    Publication Number: US20240185029A1

    Publication Date: 2024-06-06

    Application Number: US18437370

    Application Date: 2024-02-09

    CPC classification number: G06N3/045 G06N3/047 G06N3/084

    Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network; obtaining, for each of the layers in the first neural network, weight differences between an initial weight and an updated weight determined by the learning of each cycle; analyzing a statistic of the weight differences for each of the layers; determining, based on the analyzed statistic, one or more layers from among the layers to be quantized with a lower-bit precision; and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
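
    The layer-selection step can be sketched as below. This is only an illustration under stated assumptions: the statistic used here (mean absolute weight change per cycle) and the rule "layers that moved least are quantized to lower precision" are example choices, not necessarily those claimed by the patent, and the function name and dict-based interfaces are invented for the sketch.

    ```python
    import numpy as np

    def select_layers_for_low_bit(initial_weights, weight_history, k=1):
        """Return the k layer names with the smallest weight-change statistic.

        initial_weights: dict of layer name -> weight array before training
        weight_history:  dict of layer name -> list of weight arrays, one
                         snapshot per training cycle
        """
        stats = {}
        for name, w0 in initial_weights.items():
            # Statistic of the weight differences across cycles (assumption:
            # mean absolute difference from the initial weights).
            diffs = [np.abs(w - w0).mean() for w in weight_history[name]]
            stats[name] = float(np.mean(diffs))
        # Layers whose weights changed least are treated as least sensitive.
        return sorted(stats, key=stats.get)[:k]
    ```

    The selected layers would then be re-quantized with the lower-bit precision while the rest keep their original precision.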

    NEURAL NETWORK METHOD AND APPARATUS

    Publication Number: US20210081798A1

    Publication Date: 2021-03-18

    Application Number: US16835532

    Application Date: 2020-03-31

    Abstract: A method and apparatus for pruning a neural network are provided. The method sets a weight threshold value based on a weight distribution of the layers included in a neural network, predicts the change in inference accuracy of the neural network caused by pruning each layer with the weight threshold value, determines, from among the layers, a current subject layer to be pruned with the weight threshold value, and prunes the determined current subject layer.
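
    A minimal sketch of the pruning pipeline follows. The specifics are assumptions for illustration: the threshold is taken as a magnitude quantile of the pooled weight distribution, and the accuracy-change predictor is a simple proxy (fraction of total weight magnitude removed) rather than the predictor the patent describes; all function names are hypothetical.

    ```python
    import numpy as np

    def set_threshold(layers, target_fraction=0.5):
        """Weight threshold from the pooled magnitude distribution (assumption:
        the quantile that would prune target_fraction of all weights)."""
        all_w = np.concatenate([np.abs(w).ravel() for w in layers.values()])
        return float(np.quantile(all_w, target_fraction))

    def predict_accuracy_drop(layer_w, tau):
        """Proxy for the accuracy change: share of weight magnitude removed."""
        mask = np.abs(layer_w) < tau
        total = np.abs(layer_w).sum()
        return float(np.abs(layer_w[mask]).sum() / total) if total else 0.0

    def prune_layer(layer_w, tau):
        """Zero out every weight whose magnitude falls below the threshold."""
        out = layer_w.copy()
        out[np.abs(out) < tau] = 0.0
        return out
    ```

    A driver would pick, as the current subject layer, the layer whose predicted accuracy drop is smallest, prune it, and repeat until the predicted drop exceeds a tolerance.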

    NEURAL NETWORK METHOD AND APPARATUS

    Publication Number: US20240346317A1

    Publication Date: 2024-10-17

    Application Number: US18752163

    Application Date: 2024-06-24

    CPC classification number: G06N3/082 G06N3/04

    Abstract: A method and apparatus for pruning a neural network are provided. The method sets a weight threshold value based on a weight distribution of the layers included in a neural network, predicts the change in inference accuracy of the neural network caused by pruning each layer with the weight threshold value, determines, from among the layers, a current subject layer to be pruned with the weight threshold value, and prunes the determined current subject layer.

    METHOD AND APPARATUS FOR NEURAL NETWORK QUANTIZATION

    Publication Number: US20230206031A1

    Publication Date: 2023-06-29

    Application Number: US18116553

    Application Date: 2023-03-02

    CPC classification number: G06N3/045 G06N3/084 G06N3/047

    Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network; obtaining, for each of the layers in the first neural network, weight differences between an initial weight and an updated weight determined by the learning of each cycle; analyzing a statistic of the weight differences for each of the layers; determining, based on the analyzed statistic, one or more layers from among the layers to be quantized with a lower-bit precision; and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.

    METHOD AND APPARATUS FOR NEURAL NETWORK QUANTIZATION

    Publication Number: US20200218962A1

    Publication Date: 2020-07-09

    Application Number: US16738338

    Application Date: 2020-01-09

    Abstract: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a first neural network; obtaining, for each of the layers in the first neural network, weight differences between an initial weight and an updated weight determined by the learning of each cycle; analyzing a statistic of the weight differences for each of the layers; determining, based on the analyzed statistic, one or more layers from among the layers to be quantized with a lower-bit precision; and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.

    METHOD AND APPARATUS FOR GENERATING FIXED-POINT QUANTIZED NEURAL NETWORK

    Publication Number: US20190042948A1

    Publication Date: 2019-02-07

    Application Number: US16051788

    Application Date: 2018-08-01

    Abstract: A method of generating a fixed-point quantized neural network includes analyzing, for each channel, a statistical distribution of the floating-point parameter values of feature maps and a kernel from data of a pre-trained floating-point neural network; determining, based on the per-channel statistical distribution, a fixed-point expression of each parameter that statistically covers the distribution range of the floating-point parameter values; determining, based on a result of performing a convolution operation, fractional lengths of the bias and the weight for each channel among the parameters of the per-channel fixed-point expression; and generating a fixed-point quantized neural network in which the bias and the weight of each channel have the determined fractional lengths.
