Method to balance sparsity for efficient inference of deep neural networks
Abstract:
A system and method provide balanced pruning of the weights of a deep neural network (DNN). The weights of the DNN are partitioned into a plurality of groups, the number of non-zero weights in each group is counted, and the variance of these counts across the groups is determined. A loss function of the DNN is minimized using Lagrange multipliers under the constraint that the variance of the counts is equal to 0, and the weights and the Lagrange multipliers are retrained by back-propagation.
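The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the fixed-size consecutive grouping, and the single scalar multiplier `lam` are assumptions made here for illustration. The sketch only evaluates the constrained objective (task loss plus multiplier times count variance). An exact non-zero count is not differentiable, so actual retraining by back-propagation would need a smooth surrogate for the count, which the abstract does not specify.

```python
from statistics import pvariance


def group_nonzero_counts(weights, group_size):
    """Split a flat weight list into consecutive groups (assumed grouping)
    and count the non-zero weights in each group."""
    groups = [weights[i:i + group_size]
              for i in range(0, len(weights), group_size)]
    return [sum(1 for w in g if w != 0.0) for g in groups]


def count_variance(counts):
    """Population variance of the per-group non-zero counts."""
    return pvariance(counts)


def lagrangian(task_loss, counts, lam):
    """Task loss plus the multiplier times the constraint term.
    The constraint is satisfied when the variance is 0."""
    return task_loss + lam * count_variance(counts)


# Unbalanced example: three groups of four weights each.
weights = [0.5, 0.0, -0.2, 0.0,
           0.0, 0.0, 0.1, 0.0,
           0.3, 0.7, 0.0, 0.0]
counts = group_nonzero_counts(weights, group_size=4)
variance = count_variance(counts)

# Balanced example: every group holds the same number of non-zero weights.
balanced_counts = group_nonzero_counts([1.0, 0.0, 0.0, 2.0], group_size=2)
```

When every group holds the same number of non-zero weights, the variance term vanishes and the objective reduces to the task loss alone, which is the balanced state the constraint is meant to enforce.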