LOSS-ERROR-AWARE QUANTIZATION OF A LOW-BIT NEURAL NETWORK

    Publication No.: US20210019630A1

    Publication date: 2021-01-21

    Application No.: US16982441

    Filing date: 2018-07-26

    IPC classification: G06N3/08 G06K9/62 G06N3/04

    Abstract: Methods, apparatus, systems and articles of manufacture for loss-error-aware quantization of a low-bit neural network are disclosed. An example apparatus includes a network weight partitioner to partition unquantized network weights of a first network model into a first group to be quantized and a second group to be retrained. The example apparatus includes a loss calculator to process network weights to calculate a first loss. The example apparatus includes a weight quantizer to quantize the first group of network weights to generate low-bit second network weights. In the example apparatus, the loss calculator is to determine a difference between the first loss and a second loss. The example apparatus includes a weight updater to update the second group of network weights based on the difference. The example apparatus includes a network model deployer to deploy a low-bit network model including the low-bit second network weights.
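
The partition, quantize, compare-loss, and update steps named in the abstract can be illustrated with a small toy sketch. The record does not say how weights are split, what low-bit grid is used, or how the loss difference drives the update, so everything below is an assumption made for illustration: a linear model with mean squared error, a magnitude-based split, a uniform symmetric grid, and plain gradient steps on the remaining weights until the loss returns to its pre-quantization value. All function names are hypothetical.

```python
# Toy sketch only. The model, the magnitude-based split, the uniform grid,
# and the retraining rule are assumptions, not details taken from the patent.

def loss(weights, data):
    """Mean squared error of a linear model y = sum(w_i * x_i)."""
    total = 0.0
    for xs, y in data:
        pred = sum(w * x for w, x in zip(weights, xs))
        total += (pred - y) ** 2
    return total / len(data)

def partition(weights, frozen, fraction=0.5):
    """Split the still-unquantized indices: largest magnitudes get quantized."""
    free = [i for i in range(len(weights)) if i not in frozen]
    free.sort(key=lambda i: abs(weights[i]), reverse=True)
    count = max(1, int(len(free) * fraction))
    return set(free[:count]), set(free[count:])

def quantize(value, bits=3, scale=1.0):
    """Snap a value to a uniform symmetric grid with 2**bits - 1 levels."""
    levels = 2 ** (bits - 1) - 1
    step = scale / levels
    q = max(-levels, min(levels, round(value / step)))
    return q * step

def retrain(weights, trainable, data, target_loss, lr=0.05, max_steps=200):
    """Gradient steps on the trainable group while the loss gap is positive."""
    for _ in range(max_steps):
        if loss(weights, data) - target_loss <= 1e-6:
            break  # second loss is back at (or below) the first loss
        grads = {i: 0.0 for i in trainable}
        for xs, y in data:
            err = sum(w * x for w, x in zip(weights, xs)) - y
            for i in trainable:
                grads[i] += 2.0 * err * xs[i] / len(data)
        for i in trainable:
            weights[i] -= lr * grads[i]
    return weights

def quantize_network(weights, data, bits=3):
    """Repeat partition -> quantize -> retrain until every weight is low-bit."""
    weights = list(weights)
    scale = max(abs(w) for w in weights) or 1.0
    frozen = set()
    while len(frozen) < len(weights):
        first_loss = loss(weights, data)          # loss before quantizing
        to_quantize, to_retrain = partition(weights, frozen)
        for i in to_quantize:
            weights[i] = quantize(weights[i], bits, scale)
        frozen |= to_quantize
        if to_retrain:                            # compensate with the rest
            retrain(weights, to_retrain, data, first_loss)
    return weights

data = [([1.0, 0.0, 0.5, -1.0], 0.3), ([0.0, 1.0, -0.5, 0.5], 0.1),
        ([0.5, -1.0, 1.0, 0.0], 0.7), ([-1.0, 0.5, 0.0, 1.0], -0.4)]
low_bit = quantize_network([0.8, -0.3, 0.05, 0.6], data, bits=3)
```

In this sketch, quantized weights are frozen and never touched again, while the shrinking retrained group absorbs the error that quantization introduced. A real network would replace the hand-written gradient with a training framework's optimizer.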