ASYMMETRIC QUANTIZATION FOR COMPRESSION AND FOR ACCELERATION OF INFERENCE FOR NEURAL NETWORKS
Abstract:
Presented herein are embodiments of an improved asymmetric quantization (IAQ). IAQ embodiments combine the benefits of conventional asymmetric quantization and symmetric quantization while providing additional computational efficiencies. Because IAQ embodiments adopt an asymmetric range for the weights of a neural network layer, they circumvent the symmetric-range limitation of symmetric quantization. Moreover, by quantizing the offset value of each layer, an IAQ embodiment makes inference with the quantized neural network much faster than inference with a network quantized by conventional asymmetric quantization.
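The abstract does not give formulas, so the following is only a minimal sketch of one plausible reading. It assumes the per-layer offset is rounded to an integer multiple of the layer's scale, so that a weight is approximated as scale * (q + z) with both q and z integers. The function names (iaq_like_quantize, dequantize, linear_with_integer_offset) and the 8-bit default are hypothetical choices for illustration, not taken from the patent.

```python
import numpy as np

def iaq_like_quantize(w, num_bits=8):
    """Quantize weights over their asymmetric range [min(w), max(w)].

    Assumption (not stated in the abstract): the layer offset is itself
    quantized to an integer z, in units of the scale.
    """
    levels = 2 ** num_bits - 1
    w_min, w_max = float(np.min(w)), float(np.max(w))
    # Guard against a constant tensor, where the range would be zero.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    z = int(np.round(w_min / scale))  # quantized (integer) offset
    q = np.clip(np.round(w / scale) - z, 0, levels).astype(np.int64)
    return q, scale, z

def dequantize(q, scale, z):
    """Reconstruct approximate weights: w ~ scale * (q + z)."""
    return scale * (q + z)

def linear_with_integer_offset(x_int, q, scale, z):
    """Compute x @ W with integer accumulation.

    Since W ~ scale * (q + z), x @ W ~ scale * (x @ q + z * sum(x)).
    The offset term reduces to one row sum per input, and everything
    before the final multiplication by scale stays in integers.
    x_int is assumed to hold already-quantized integer activations
    (their own scale is omitted here for brevity).
    """
    acc = x_int @ q + z * x_int.sum(axis=-1, keepdims=True)
    return scale * acc
```

Under this assumption, the offset contribution is folded into the integer accumulator rather than handled as a separate floating-point term, which is one way a quantized offset could speed up inference relative to conventional asymmetric quantization.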