PRUNING ACTIVATIONS AND WEIGHTS OF NEURAL NETWORKS WITH PROGRAMMABLE THRESHOLDS

    Publication number: US20230394312A1

    Publication date: 2023-12-07

    Application number: US18453715

    Filing date: 2023-08-22

    CPC classification number: G06N3/082 G06N3/0464

    Abstract: Activations (e.g., output activations) or weights of intermediate layers of deep neural networks (DNNs) can be pruned to increase sparsity and reduce the computation required in those layers or in subsequent layers. A pruning threshold may be determined, e.g., through an iterative process, and activations or weights having absolute values lower than the pruning threshold may be changed to zero. A first pruning threshold may be used to prune an output tensor or kernel of a layer. The loss in the accuracy of the DNN due to the pruning may then be determined. A second pruning threshold may be determined based on the first pruning threshold and the accuracy loss. The DNN may be modified by adding a pruning operation to the layer. The pruning operation can prune output tensors or kernels of the layer based on the second pruning threshold.
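
The thresholding step and the move from a first to a second threshold described in the abstract can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how the second threshold is derived, so the multiplicative update in `second_threshold`, the `rate` and `max_loss` parameters, and the magnitude-based `proxy_loss` (a stand-in for a measured accuracy loss of the DNN) are all assumptions introduced here.

```python
from typing import List


def prune(values: List[float], threshold: float) -> List[float]:
    """Set every value whose absolute value is below the threshold to zero."""
    return [0.0 if abs(v) < threshold else v for v in values]


def sparsity(values: List[float]) -> float:
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


def proxy_loss(original: List[float], pruned: List[float]) -> float:
    """ASSUMPTION: stand-in for the DNN's accuracy loss.

    Returns the share of total absolute magnitude removed by pruning.
    A real pipeline would evaluate the pruned network on validation data.
    """
    total = sum(abs(v) for v in original)
    removed = sum(abs(o) for o, p in zip(original, pruned) if p == 0.0)
    return removed / total if total else 0.0


def second_threshold(first: float, loss: float, max_loss: float,
                     rate: float = 0.5) -> float:
    """ASSUMPTION: hypothetical update rule.

    Lower the threshold if the loss exceeds the budget, otherwise raise it.
    """
    if loss > max_loss:
        return first * (1.0 - rate)
    return first * (1.0 + rate)


# Flattened output tensor (or kernel) of one layer; illustrative values only.
tensor = [0.05, -0.2, 0.7, -0.01, 1.3, 0.0, -0.4, 0.15]

first = 0.2
pruned_first = prune(tensor, first)
loss = proxy_loss(tensor, pruned_first)

# The second threshold depends on the first threshold and the observed loss.
second = second_threshold(first, loss, max_loss=0.1)
pruned_second = prune(tensor, second)

print(pruned_first, sparsity(pruned_first))
print(second, pruned_second, sparsity(pruned_second))
```

In this sketch the comparison is strict, so a value whose magnitude equals the threshold is kept, matching the abstract's wording of "absolute values lower than the pruning threshold". Repeating the update until the loss settles near the budget would give one possible form of the iterative process the abstract mentions.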
