Neural Network Parameter Quantization Method and Apparatus

    Publication No.: US20250117637A1

    Publication Date: 2025-04-10

    Application No.: US18961921

    Application Date: 2024-11-27

    Abstract: A neural network parameter quantization method includes obtaining a parameter of each neuron in a to-be-quantized model to obtain a parameter set, clustering parameters in the parameter set to obtain types of classified data, and quantizing each type of classified data in the types of classified data to obtain at least one type of quantization parameter, where the at least one type of quantization parameter is used to obtain a compression model, and precision of the at least one type of quantization parameter is lower than precision of a parameter in the to-be-quantized model.
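
    The cluster-then-quantize flow described in the abstract can be illustrated with a minimal sketch. The abstract does not name a clustering algorithm or a quantization scheme, so the choices here are assumptions made only for illustration: a simple 1-D k-means for the clustering step, and uniform 8-bit affine quantization with one scale and offset per cluster. All function names (`kmeans_1d`, `quantize_cluster`, `quantize_parameter_set`, `dequantize`) are hypothetical and do not come from the patent.

```python
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Assumed clustering step: plain 1-D k-means with deterministic init."""
    # Start centroids at evenly spaced quantiles of the parameter set.
    centroids = np.quantile(values, np.linspace(0.0, 1.0, k)).astype(np.float64)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:
                centroids[j] = members.mean()
    # Final assignment against the last centroids.
    return np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)

def quantize_cluster(values, bits=8):
    """Assumed quantization step: uniform affine mapping to unsigned codes."""
    lo, hi = float(values.min()), float(values.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.clip(np.round((values - lo) / scale), 0, levels).astype(np.uint8)
    return codes, scale, lo

def quantize_parameter_set(params, k=4, bits=8):
    """Cluster the flattened parameters, then quantize each cluster separately."""
    flat = np.asarray(params, dtype=np.float32).ravel()
    labels = kmeans_1d(flat, k)
    codes = np.zeros(flat.shape, dtype=np.uint8)
    table = {}  # cluster id -> (scale, offset), one entry per non-empty cluster
    for j in range(k):
        mask = labels == j
        if mask.any():
            cluster_codes, scale, lo = quantize_cluster(flat[mask], bits)
            codes[mask] = cluster_codes
            table[j] = (scale, lo)
    return labels, codes, table

def dequantize(labels, codes, table):
    """Rebuild approximate float32 parameters from codes and per-cluster tables."""
    out = np.empty(labels.shape, dtype=np.float32)
    for j, (scale, lo) in table.items():
        mask = labels == j
        out[mask] = codes[mask] * scale + lo
    return out

# Example: a synthetic parameter set drawn from two well-separated groups.
rng = np.random.default_rng(0)
weights = np.concatenate([rng.normal(-2.0, 0.1, 500), rng.normal(3.0, 0.5, 500)])
labels, codes, table = quantize_parameter_set(weights, k=4, bits=8)
restored = dequantize(labels, codes, table)
```

    In this sketch the 32-bit floating-point parameters are replaced by 8-bit codes, which matches the abstract's requirement that the quantization parameters have lower precision than the original parameters. Quantizing each cluster over its own value range keeps the step size small where values are tightly grouped; the per-element cluster labels and the per-cluster `(scale, offset)` table are extra storage that a real compression model would need to account for.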
