-
Publication number: US20240220783A1
Publication date: 2024-07-04
Application number: US18431455
Application date: 2024-02-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: Arun CHAUHAN , Utsav TIWARI , Vikram Nelvoy RAJENDIRAN , Payal ANAND , Hitesh KUMAR , Rohit SAXENA
IPC: G06N3/0495
CPC classification number: G06N3/0495
Abstract: A method for mixed precision quantization of an artificial intelligence (AI) model by an electronic device is provided. The method includes perturbing, by the electronic device, the weights of each layer of a plurality of layers of the AI model a predefined number of times, determining, by the electronic device, a change in an output of each layer of the plurality of layers based on the perturbation in the weights of that layer, determining, by the electronic device, a sensitivity metric for each layer of the plurality of layers as a measure of the change in the output of that layer, assigning, by the electronic device, a bit-precision to each layer of the plurality of layers based on the determined sensitivity metric, and performing, by the electronic device, the mixed precision quantization of the AI model using the bit-precision assigned to each layer.
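The steps in this abstract (perturb weights, measure the change in layer output, derive a sensitivity, assign bits, quantize) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the layers are plain linear maps, the perturbation is Gaussian noise, the sensitivity is the mean relative change in output, bit widths are assigned by rank from the hypothetical set (4, 8, 16), and quantization is symmetric and uniform. The function names are hypothetical as well.

```python
import numpy as np

def quantize(w, bits):
    """Symmetric uniform quantization to `bits` bits (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / levels
    return np.round(w / scale) * scale

def layer_sensitivity(w, x, num_perturbations=8, noise_scale=0.01, rng=None):
    """Mean relative change in a linear layer's output over repeated
    random weight perturbations (assumed sensitivity metric)."""
    rng = np.random.default_rng(0) if rng is None else rng
    base = x @ w
    changes = []
    for _ in range(num_perturbations):
        noise = rng.normal(0.0, noise_scale * np.std(w), size=w.shape)
        perturbed = x @ (w + noise)
        changes.append(np.linalg.norm(perturbed - base)
                       / (np.linalg.norm(base) + 1e-12))
    return float(np.mean(changes))

def assign_bits(sensitivities, choices=(4, 8, 16)):
    """Rank layers by sensitivity; more sensitive layers get more bits
    (assumed assignment policy)."""
    order = np.argsort(sensitivities)  # least to most sensitive
    n = len(sensitivities)
    bits = [0] * n
    for rank, idx in enumerate(order):
        bucket = min(rank * len(choices) // n, len(choices) - 1)
        bits[int(idx)] = choices[bucket]
    return bits

def mixed_precision_quantize(weights, inputs):
    """Return per-layer bit widths and the quantized weights."""
    sens = [layer_sensitivity(w, x) for w, x in zip(weights, inputs)]
    bits = assign_bits(sens)
    return bits, [quantize(w, b) for w, b in zip(weights, bits)]
```

Calling `mixed_precision_quantize` with a list of weight matrices and a matching list of representative layer inputs returns one bit width per layer together with the quantized weights. Rank-based bucketing is only one possible policy; a threshold on the sensitivity value would fit the abstract equally well.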
-
2.
Publication number: US20210232921A1
Publication date: 2021-07-29
Application number: US17159598
Application date: 2021-01-27
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Akshay PARASHAR , Arun ABRAHAM , Payal ANAND , Deepthy RAVI , Venkappa MALA , Vikram Nelvoy RAJENDIRAN
Abstract: A method, an apparatus, and a system for configuring a neural network across heterogeneous processors are provided. The method includes creating a unified neural network profile for the plurality of processors; receiving at least one request to perform at least one task using the neural network; determining a type of the requested at least one task as one of an asynchronous task and a synchronous task; and parallelizing processing of the neural network across the plurality of processors to perform the requested at least one task, based on the type of the requested at least one task and the created unified neural network profile.
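The flow in this abstract (build a unified profile, classify the request as synchronous or asynchronous, then distribute work across processors accordingly) can be sketched in plain Python. The abstract does not say how the task type changes the distribution, so the policy below is purely an assumption for illustration: a synchronous request places each layer on the processor with the lowest profiled latency, while an asynchronous request spreads a batch of inputs across processors by greedy load balancing. The profile values, the `blocking` and `batch` request fields, and all function names are hypothetical.

```python
from enum import Enum

class TaskType(Enum):
    SYNCHRONOUS = "sync"
    ASYNCHRONOUS = "async"

# Hypothetical unified profile: estimated latency in ms of each layer
# on each processor.
PROFILE = {
    "conv1": {"cpu": 4.0, "gpu": 1.0, "npu": 0.8},
    "conv2": {"cpu": 6.0, "gpu": 1.5, "npu": 2.0},
    "fc":    {"cpu": 1.0, "gpu": 1.2, "npu": 1.5},
}

def classify(request):
    """Assumed rule: a request whose caller blocks on the result is synchronous."""
    if request.get("blocking", False):
        return TaskType.SYNCHRONOUS
    return TaskType.ASYNCHRONOUS

def plan_sync(profile):
    """Assumed latency-first policy: each layer runs on its fastest processor."""
    return {layer: min(costs, key=costs.get) for layer, costs in profile.items()}

def plan_async(profile, batch_size):
    """Assumed throughput-first policy: give each input to the processor
    that would finish it earliest, given the work already queued there."""
    processors = sorted({p for costs in profile.values() for p in costs})
    total = {p: sum(costs[p] for costs in profile.values()) for p in processors}
    load = {p: 0.0 for p in processors}
    assignment = []
    for _ in range(batch_size):
        best = min(processors, key=lambda p: load[p] + total[p])
        load[best] += total[best]
        assignment.append(best)
    return assignment

def dispatch(request, profile=PROFILE):
    """Pick a plan based on the task type and the unified profile."""
    task_type = classify(request)
    if task_type is TaskType.SYNCHRONOUS:
        return task_type, plan_sync(profile)
    return task_type, plan_async(profile, request.get("batch", 1))
```

With the sample profile, a blocking request maps `conv1` to the NPU, `conv2` to the GPU, and `fc` to the CPU, whereas a non-blocking request with four inputs alternates between the GPU and the NPU and leaves the slower CPU idle.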
-