Patent Application
- Patent title: ROBUST GRADIENT WEIGHT COMPRESSION SCHEMES FOR DEEP LEARNING APPLICATIONS
- Application No.: US15830170; Filing date: 2017-12-04
- Publication No.: US20190171935A1; Publication date: 2019-06-06
- Inventors: Ankur Agrawal, Daniel Brand, Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan
- Applicant: International Business Machines Corporation
- Primary classification: G06N3/08
- IPC classifications: G06N3/08; G06N3/04
Abstract:
Embodiments of the present invention provide a computer-implemented method for adaptive residual gradient compression for training of a deep learning neural network (DNN). The method includes obtaining, by a first learner of a plurality of learners, a current gradient vector for a neural network layer of the DNN, in which the current gradient vector includes gradient weights of parameters of the neural network layer that are calculated from a mini-batch of training data. A current residue vector is generated that includes residual gradient weights for the mini-batch. A compressed current residue vector is generated based on dividing the residual gradient weights of the current residue vector into a plurality of bins of a uniform size and quantizing a subset of the residual gradient weights of one or more bins of the plurality of bins. The compressed current residue vector is then transmitted to a second learner of the plurality of learners or to a parameter server.
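The abstract's compression pipeline (accumulate residual gradients, split them into uniform-size bins, quantize only a subset per bin, and carry the quantization error forward) can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the function name `compress_residue`, the bin-relative threshold rule, and the ±scale one-value quantization are all assumptions made for the example.

```python
import numpy as np

def compress_residue(gradient, residue, bin_size=64, threshold_scale=2.0):
    """Hypothetical sketch of adaptive residual gradient compression:
    accumulate the prior residue into the current gradient, divide the
    result into uniform-size bins, and quantize only the large entries
    of each bin, keeping the quantization error as the new residue."""
    # Current residue vector: previous residue plus the new gradient.
    current = gradient + residue
    n = len(current)
    compressed = []               # sparse (index, quantized value) pairs to transmit
    new_residue = current.copy()  # untransmitted mass stays in the residue
    for start in range(0, n, bin_size):           # uniform-size bins
        bin_vals = current[start:start + bin_size]
        scale = np.max(np.abs(bin_vals))          # per-bin magnitude scale
        if scale == 0.0:
            continue
        # Select a subset of entries to quantize: those whose magnitude
        # exceeds a bin-relative threshold (an assumed selection rule).
        mask = np.abs(bin_vals) >= scale / threshold_scale
        for j in np.nonzero(mask)[0]:
            q = scale * np.sign(bin_vals[j])      # quantize to +/- bin scale
            compressed.append((start + int(j), float(q)))
            # Residual gradient weight = what the quantized value missed.
            new_residue[start + j] = bin_vals[j] - q
    return compressed, new_residue
```

In a multi-learner setting, the sparse `compressed` list is what would be sent to another learner or to the parameter server, while `new_residue` is retained locally and folded into the next mini-batch's gradient, so no gradient information is permanently discarded.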