MIXED PRECISION QUANTIZATION OF AN ARTIFICIAL INTELLIGENCE MODEL

    Publication (announcement) number: US20240220783A1

    Publication (announcement) date: 2024-07-04

    Application number: US18431455

    Filing date: 2024-02-02

    CPC classification number: G06N3/0495

    Abstract: A method for mixed precision quantization of an artificial intelligence (AI) model by an electronic device is provided. The method includes perturbing, by the electronic device, the weights of each layer of a plurality of layers of the AI model a pre-defined number of times; determining, by the electronic device, a change in the output of each layer of the plurality of layers based on the perturbation of that layer's weights; determining, by the electronic device, a sensitivity metric for each layer of the plurality of layers as a measure of the change in the output of that layer; assigning, by the electronic device, a bit-precision to each layer of the plurality of layers based on the determined sensitivity metric; and performing, by the electronic device, the mixed precision quantization of the AI model using the bit-precision assigned to each layer of the plurality of layers.
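
    Illustrative sketch (not part of the patent text): the steps named in the abstract can be pictured with a small NumPy example. Everything specific here is an assumption made for illustration only: the Gaussian noise model, the relative L2 norm used as the sensitivity metric, the rank-based split into candidate bit widths of 4, 8 and 16, the symmetric uniform quantizer, and the hypothetical function names layer_sensitivity, assign_bits and quantize. The abstract does not specify any of these choices, and each layer is simplified to a single matrix multiplication.

```python
import numpy as np


def layer_sensitivity(weights, x, n_perturb=8, noise_scale=0.01, seed=0):
    """Mean relative change in a linear layer's output over n_perturb
    random weight perturbations (assumed metric, not from the patent)."""
    rng = np.random.default_rng(seed)
    base = x @ weights
    base_norm = np.linalg.norm(base) + 1e-12
    changes = []
    for _ in range(n_perturb):
        # Assumed noise model: zero-mean Gaussian scaled to the weight spread.
        noise = rng.normal(0.0, noise_scale * np.std(weights), size=weights.shape)
        perturbed = x @ (weights + noise)
        changes.append(np.linalg.norm(perturbed - base) / base_norm)
    return float(np.mean(changes))


def assign_bits(sensitivities, choices=(4, 8, 16)):
    """Rank layers by sensitivity and give the most sensitive layers the
    widest bit width (assumed policy; choices must be in ascending order)."""
    order = np.argsort(sensitivities)  # least sensitive first
    bits = [0] * len(sensitivities)
    for width, group in zip(choices, np.array_split(order, len(choices))):
        for idx in group:
            bits[idx] = width
    return bits


def quantize(weights, n_bits):
    """Symmetric uniform quantization; returns de-quantized float weights."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax
    if scale == 0:
        return weights.copy()
    return np.round(weights / scale).clip(-qmax, qmax) * scale


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.normal(size=(32, 16))                      # toy calibration batch
    layers = [rng.normal(size=(16, 16)) for _ in range(3)]

    scores = [layer_sensitivity(w, x, seed=i) for i, w in enumerate(layers)]
    bit_widths = assign_bits(scores)
    quantized = [quantize(w, b) for w, b in zip(layers, bit_widths)]
    print(scores, bit_widths)
```

    In this sketch the per-layer scores are only compared against each other, so the absolute noise scale matters less than applying the same perturbation procedure to every layer.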
