QUANTIZATION METHOD FOR NEURAL NETWORK MODEL AND DEEP LEARNING ACCELERATOR

    Publication (announcement) number: US20230196094A1

    Publication (announcement) date: 2023-06-22

    Application number: US17560010

    Application date: 2021-12-22

    CPC classification number: G06N3/08 G06N3/0481

    Abstract: A quantization method for a neural network model includes the following steps: initializing a weight array of the neural network model, wherein the weight array includes a plurality of initial weights; performing a quantization procedure to generate a quantized weight array from the weight array, wherein the quantized weight array includes a plurality of quantized weights within a fixed range; performing a training procedure of the neural network model according to the quantized weight array; and determining whether a loss function converges during the training procedure and, when it does, outputting a post-trained quantized weight array.
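
    The four steps in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the claimed method: the abstract does not say how weights are mapped into the fixed range, how gradients pass through the quantizer, or how convergence is tested. The sketch assumes a symmetric uniform grid on [-limit, limit], a straight-through update of the unquantized weights, a toy linear model with mean squared error, and a patience-based convergence test. All names (quantize, loss_and_grad, train_quantized) and all parameter values are hypothetical.

```python
import random


def quantize(weights, bits=8, limit=1.0):
    """Clip each weight to [-limit, limit] and snap it to a uniform grid.

    Assumption: a symmetric uniform grid; the abstract only requires that
    quantized weights lie within a fixed range.
    """
    levels = 2 ** (bits - 1) - 1  # grid steps per side, e.g. 127 for 8 bits
    step = limit / levels
    quantized = []
    for w in weights:
        clipped = max(-limit, min(limit, w))
        quantized.append(round(clipped / step) * step)
    return quantized


def loss_and_grad(q_weights, data):
    """Mean squared error of a toy linear model, and its gradient."""
    n = len(data)
    loss = 0.0
    grad = [0.0] * len(q_weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(q_weights, x)) - y
        loss += err * err / n
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / n
    return loss, grad


def train_quantized(data, n_weights, bits=8, limit=1.0, lr=0.05,
                    tol=1e-9, patience=50, max_steps=5000, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize the weight array.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_weights)]
    best_q, best_loss, stale = None, float("inf"), 0
    for _ in range(max_steps):
        # Step 2: generate the quantized weight array.
        q = quantize(weights, bits, limit)
        # Step 3: evaluate the model with the quantized weights.
        loss, grad = loss_and_grad(q, data)
        # Step 4: treat the loss as converged once it stops improving
        # for `patience` consecutive steps (an assumed criterion).
        if loss < best_loss - tol:
            best_q, best_loss, stale = q, loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
        # Straight-through update (assumption): apply the gradient taken
        # at the quantized weights to the unquantized weights.
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return best_q, best_loss
```

    A possible use: build a small data set whose targets come from known weights, call train_quantized(data, n_weights=2), and check that the returned weights lie on the grid inside [-1, 1] and near the known values. Keeping the best quantized array seen so far, rather than the last one, avoids returning a worse result when the weights oscillate between two neighboring grid points.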
