Method and apparatus with neural network parameter quantization

    Publication No.: US11481608B2

    Publication date: 2022-10-25

    Application No.: US16909095

    Filing date: 2020-06-23

    Abstract: A processor-implemented neural network method includes: determining a respective probability density function (PDF) of normalizing a statistical distribution of parameter values, for each channel of each of a plurality of feature maps of a pre-trained neural network; determining, for each channel, a corresponding first quantization range for performing quantization of corresponding parameter values, based on a quantization error and a quantization noise of the respective determined PDF; determining, for each channel, a corresponding second quantization range, based on a signal-to-quantization noise ratio (SQNR) of the respective determined PDF; correcting, for each channel, the corresponding first quantization range based on the corresponding second quantization range; and generating a quantized neural network, based on the corrected first quantization range corresponding for each channel.
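The per-channel flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented method: the abstract does not state which PDF family is used, how the two ranges are computed, or how the first range is corrected by the second. The code below assumes a Gaussian fit for the normalized PDF, an analytic "overload error plus rounding noise" model for the first range, an empirical SQNR search over the channel values for the second range, and a simple weighted blend (the hypothetical `alpha` parameter) as the correction. All function names are hypothetical.

```python
import math
import numpy as np


def quantize(x, clip, bits):
    """Uniform symmetric quantization of x onto [-clip, clip]."""
    levels = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits
    step = clip / levels
    return np.clip(np.round(x / step), -levels, levels) * step


def first_range(channel, bits, grid=200):
    """ASSUMPTION: fit a Gaussian to the channel, then pick the clipping
    threshold that minimizes (overload error + rounding noise) under that PDF."""
    mu = float(channel.mean())
    sigma = float(channel.std()) + 1e-12
    z = np.linspace(-8.0, 8.0, 4001)          # normalized axis
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    dz = z[1] - z[0]
    levels = 2 ** (bits - 1) - 1
    best_t, best_err = 0.5, float("inf")
    for t in np.linspace(0.5, 8.0, grid):
        # Error from values clipped at +/- t (overload distortion).
        overload = float(np.sum(np.clip(np.abs(z) - t, 0.0, None) ** 2 * pdf) * dz)
        # Rounding noise of a uniform quantizer with step t / levels.
        noise = (t / levels) ** 2 / 12.0
        if overload + noise < best_err:
            best_t, best_err = float(t), overload + noise
    # Map the normalized threshold back to the channel's own scale.
    return best_t * sigma + abs(mu)


def second_range(channel, bits, grid=200):
    """ASSUMPTION: pick the clipping threshold that maximizes the empirical
    SQNR (signal power / quantization noise power) of the channel values."""
    max_abs = float(np.max(np.abs(channel))) + 1e-12
    signal = float(np.mean(channel ** 2)) + 1e-12
    best_t, best_sqnr = max_abs, -float("inf")
    for t in np.linspace(max_abs / grid, max_abs, grid):
        err = channel - quantize(channel, t, bits)
        sqnr = signal / (float(np.mean(err ** 2)) + 1e-20)
        if sqnr > best_sqnr:
            best_t, best_sqnr = float(t), sqnr
    return best_t


def corrected_range(r1, r2, alpha=0.5):
    """ASSUMPTION: correct the first range by blending it with the second."""
    return (1.0 - alpha) * r1 + alpha * r2


def quantize_feature_map(fmap, bits=8, alpha=0.5):
    """Quantize a (channels, height, width) array with one range per channel."""
    out = np.empty_like(fmap, dtype=np.float64)
    ranges = []
    for c in range(fmap.shape[0]):
        r = corrected_range(first_range(fmap[c], bits),
                            second_range(fmap[c], bits), alpha)
        ranges.append(r)
        out[c] = quantize(fmap[c], r, bits)
    return out, ranges
```

Calling `quantize_feature_map(fmap, bits=8)` on an array of shape `(channels, height, width)` returns the quantized array together with the list of per-channel ranges. Choosing the range per channel, rather than once per layer, lets channels with very different value spreads each use the full set of quantization levels.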

    Method and apparatus with neural network parameter quantization

    Publication No.: US11816557B2

    Publication date: 2023-11-14

    Application No.: US17950342

    Filing date: 2022-09-22

    Abstract: A processor-implemented neural network method includes: determining a respective probability density function (PDF) of normalizing a statistical distribution of parameter values, for each channel of each of a plurality of feature maps of a pre-trained neural network; determining, for each channel, a corresponding first quantization range for performing quantization of corresponding parameter values, based on a quantization error and a quantization noise of the respective determined PDF; determining, for each channel, a corresponding second quantization range, based on a signal-to-quantization noise ratio (SQNR) of the respective determined PDF; correcting, for each channel, the corresponding first quantization range based on the corresponding second quantization range; and generating a quantized neural network, based on the corrected first quantization range corresponding for each channel.
