Invention Grant
- Patent Title: Robust gradient weight compression schemes for deep learning applications
-
Application No.: US15830170Application Date: 2017-12-04
-
Publication No.: US11295208B2Publication Date: 2022-04-05
- Inventor: Ankur Agrawal , Daniel Brand , Chia-Yu Chen , Jungwook Choi , Kailash Gopalakrishnan
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Cantor Colburn LLP
- Agent Stosch Sabo
- Main IPC: G06N3/08
- IPC: G06N3/08 ; G06N3/04

Abstract:
Embodiments of the present invention provide a computer-implemented method for adaptive residual gradient compression for training of a deep learning neural network (DNN). The method includes obtaining, by a first learner, a current gradient vector for a neural network layer of the DNN, in which the current gradient vector includes gradient weights of parameters of the neural network layer that are calculated from a mini-batch of training data. A current residue vector is generated that includes residual gradient weights for the mini-batch. A compressed current residue vector is generated based on dividing the residual gradient weights of the current residue vector into a plurality of bins of a uniform size and quantizing a subset of the residual gradient weights of one or more bins of the plurality of bins. The compressed current residue vector is then transmitted to a second learner of the plurality of learners or to a parameter server.
Public/Granted literature
- US20190171935A1 ROBUST GRADIENT WEIGHT COMPRESSION SCHEMES FOR DEEP LEARNING APPLICATIONS Public/Granted day:2019-06-06
Information query