-
Publication No.: US20220129759A1
Publication Date: 2022-04-28
Application No.: US17441622
Filing Date: 2019-06-26
Applicant: Intel Corporation
Inventor: Anbang YAO , Aojun ZHOU , Dawei SUN , Dian GU , Yurong CHEN
Abstract: Apparatuses, methods, and GPUs are disclosed for universal loss-error-aware quantization (ULQ) of a neural network (NN). In one example, an apparatus includes data storage to store data including activation sets and weight sets, and a network processor coupled to the data storage. The network processor is configured to implement the ULQ by constraining a low-precision NN model based on a full-precision NN model, to perform a loss-error-aware activation quantization to quantize activation sets into ultra-low-bit versions with given bit-width values, to optimize the NN with respect to a loss function that is based on the full-precision NN model, and to perform a loss-error-aware weight quantization to quantize weight sets into ultra-low-bit versions.
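The loss-error-aware weight quantization described above can be illustrated with a minimal sketch: quantize a weight set to an ultra-low bit-width, choosing the quantizer scale that minimizes the reconstruction (loss) error. This is an assumption-laden illustration, not the patented ULQ method; the function name, the uniform symmetric quantizer, and the grid search over scales are all choices made here for clarity.

```python
import numpy as np

def quantize_ultra_low_bit(weights, bit_width=2):
    """Hypothetical sketch of loss-error-aware weight quantization.

    Maps a full-precision weight set to an ultra-low-bit version for a
    given bit-width, picking the scale that minimizes the squared
    quantization error. Illustrative only; not the patented ULQ.
    Assumes a non-all-zero weight array.
    """
    # With bit_width=2 the quantized levels are {-1, 0, +1} (ternary).
    levels = 2 ** (bit_width - 1) - 1
    max_abs = np.max(np.abs(weights))
    best_scale, best_err, best_q = None, np.inf, None
    # Grid-search candidate scales for the minimal reconstruction error
    # (a stand-in for the patent's loss-error-aware optimization).
    for s in np.linspace(max_abs / (10 * levels), max_abs / levels, 50):
        q = np.clip(np.round(weights / s), -levels, levels)
        err = np.sum((weights - s * q) ** 2)
        if err < best_err:
            best_scale, best_err, best_q = s, err, q
    # Return the dequantized weights and the chosen scale.
    return best_scale * best_q, best_scale
```

In this sketch the same routine could be applied per layer to both weight sets and activation sets; the patent additionally constrains the low-precision model with the full-precision model during training, which a post-hoc quantizer like this does not capture.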
-
Publication No.: US20220147791A1
Publication Date: 2022-05-12
Application No.: US17435657
Filing Date: 2019-06-21
Applicant: Intel Corporation
Inventor: Anbang YAO , Jiahui ZHANG , Dawei SUN , Dian GU , Yurong CHEN
IPC: G06N3/04
Abstract: Embodiments are generally directed to sparse 3D convolution acceleration in a convolutional layer of an artificial neural network model. An embodiment of an apparatus includes one or more processors including a graphics processor to process data; and a memory for storage of data, including feature maps. The one or more processors are to provide for sparse 3D convolution acceleration by applying a shared 3D convolutional kernel/filter to an input feature map to produce an output feature map, including increasing sparsity of the input feature map by partitioning it into multiple disjoint input groups; generation of multiple disjoint output groups corresponding to the input groups by performing a convolution calculation represented by the shared 3D convolutional kernel/filter on all feature values associated with active/valid voxels of each input group to produce corresponding feature values within corresponding output groups; and outputting the output feature map by sequentially stacking the output groups.
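The core computation in the abstract above, convolving only the active/valid voxels of each disjoint input group with a shared 3D kernel, can be sketched as follows. This is a toy single-channel illustration under stated assumptions (the function names, the depth-axis partition, and the "valid" padding are choices made here), not the patented acceleration scheme.

```python
import numpy as np

def conv3d_active(group, kernel):
    """Naive 'valid' 3D cross-correlation that visits only the active
    (non-zero) voxels of the input group, scattering each voxel's
    contribution into the output. Matches dense convolution, but the
    work scales with the number of active voxels."""
    kd, kh, kw = kernel.shape
    D, H, W = group.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    # Iterate only over active voxels of this input group.
    for d, h, w in zip(*np.nonzero(group)):
        v = group[d, h, w]
        # This voxel feeds every output position whose receptive
        # field covers it.
        for i in range(max(0, d - kd + 1), min(out.shape[0], d + 1)):
            for j in range(max(0, h - kh + 1), min(out.shape[1], h + 1)):
                for k in range(max(0, w - kw + 1), min(out.shape[2], w + 1)):
                    out[i, j, k] += v * kernel[d - i, h - j, w - k]
    return out

def sparse_conv3d_grouped(feature_map, kernel, num_groups=2):
    """Partition the input feature map into disjoint groups (here:
    along the depth axis, an assumed scheme), convolve each with the
    shared kernel, and sequentially stack the output groups. Note the
    per-group 'valid' convolution ignores cross-group boundaries, so
    this is a simplification of the method in the abstract."""
    groups = np.array_split(feature_map, num_groups, axis=0)
    outs = [conv3d_active(g, kernel) for g in groups]
    return np.concatenate(outs, axis=0)
```

The scatter loop in `conv3d_active` is where the sparsity pays off: zero voxels are never touched, so a highly sparse input group costs far fewer multiply-accumulates than the dense triple-nested convolution it reproduces.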
-