Method and apparatus for optimizing and applying multilayer neural network model, and storage medium
Abstract:
A method and an apparatus for optimizing and applying a multilayer neural network model, and a storage medium, are provided. The optimization method includes: dividing out at least one sub-structure from the multilayer neural network model to be optimized, wherein the tail layer of each divided sub-structure is a quantization layer; and, for each divided sub-structure, transferring the operation parameters of the layers other than the quantization layer to the quantization layer and updating the quantization threshold parameters in the quantization layer based on the transferred operation parameters. Operating a multilayer neural network model optimized by this method requires fewer processor resources.
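The transfer step can be illustrated with a minimal sketch. The abstract does not say which operation parameters are transferred or how many quantization levels are used. This example therefore assumes a sub-structure with a single per-channel affine layer (a scale and a shift, as in batch normalization) followed by a 1-bit quantization layer with one threshold. All names here (`FoldedThreshold`, `fold_affine_into_threshold`, `quantize_reference`, `quantize_folded`) are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class FoldedThreshold:
    """Quantization threshold after absorbing the preceding affine layer."""
    threshold: float  # compared directly against the affine layer's input
    flip: bool        # True when a negative scale reverses the comparison


def fold_affine_into_threshold(scale: float, shift: float, threshold: float) -> FoldedThreshold:
    """Fold y = scale * x + shift followed by (y >= threshold) into one test on x.

    Assumption: scale is non-zero; a zero scale would make the output constant.
    """
    if scale == 0:
        raise ValueError("scale must be non-zero for this sketch")
    return FoldedThreshold(threshold=(threshold - shift) / scale, flip=scale < 0)


def quantize_reference(x: float, scale: float, shift: float, threshold: float) -> int:
    """Unoptimized sub-structure: affine layer, then 1-bit quantization."""
    return 1 if scale * x + shift >= threshold else 0


def quantize_folded(x: float, folded: FoldedThreshold) -> int:
    """Optimized sub-structure: only the quantization layer remains."""
    if folded.flip:
        return 1 if x <= folded.threshold else 0
    return 1 if x >= folded.threshold else 0
```

Under these assumptions the affine layer no longer costs a multiply and an add per element at inference time, because its effect is captured entirely by the updated threshold. This is one way the reduction in processor resources could arise.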