- Patent title: Robust gradient weight compression schemes for deep learning applications
- Application number: US15830170
- Filing date: 2017-12-04
- Publication number: US11295208B2
- Publication date: 2022-04-05
- Inventors: Ankur Agrawal, Daniel Brand, Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan
- Applicant: International Business Machines Corporation
- Applicant address: Armonk, NY, US
- Assignee: International Business Machines Corporation
- Current assignee: International Business Machines Corporation
- Current assignee address: Armonk, NY, US
- Agency: Cantor Colburn LLP
- Agent: Stosch Sabo
- Primary classification: G06N3/08
- IPC classifications: G06N3/08; G06N3/04

Abstract:
Embodiments of the present invention provide a computer-implemented method for adaptive residual gradient compression for training of a deep learning neural network (DNN). The method includes obtaining, by a first learner of a plurality of learners, a current gradient vector for a neural network layer of the DNN, in which the current gradient vector includes gradient weights of parameters of the neural network layer calculated from a mini-batch of training data. A current residue vector is generated that includes residual gradient weights for the mini-batch. A compressed current residue vector is generated by dividing the residual gradient weights of the current residue vector into a plurality of bins of uniform size and quantizing a subset of the residual gradient weights of one or more of the bins. The compressed current residue vector is then transmitted to a second learner of the plurality of learners or to a parameter server.
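The abstract outlines a bin-and-quantize step over residual gradients. Below is a minimal NumPy sketch of that general idea, for illustration only: the function name `compress_residual_gradients`, the sign-times-bin-scale quantization, and the `threshold_ratio` parameter are assumptions made here and do not reproduce the claimed scheme.

```python
import numpy as np

def compress_residual_gradients(gradient, residue, bin_size=64, threshold_ratio=0.9):
    """Illustrative residual-gradient compression sketch (not the patented method).

    Accumulates the current gradient into the carried-over residue, splits the
    residue vector into uniform bins, and within each bin quantizes only the
    entries whose magnitude reaches a fraction of the bin's largest magnitude.
    Quantized entries are emitted as sparse (index, value) pairs; their
    quantization error, and all untransmitted entries, become the new residue.
    """
    residue = residue + gradient          # fold this mini-batch's gradient into the residue
    new_residue = residue.copy()
    packed = []                           # sparse (index, quantized value) pairs to transmit

    for start in range(0, residue.size, bin_size):
        stop = min(start + bin_size, residue.size)
        bin_vals = residue[start:stop]
        scale = np.max(np.abs(bin_vals))  # per-bin quantization scale
        if scale == 0.0:
            continue
        threshold = threshold_ratio * scale
        for offset, value in enumerate(bin_vals):
            if abs(value) >= threshold:
                quantized = scale if value > 0 else -scale       # sign * per-bin scale
                packed.append((start + offset, float(quantized)))
                new_residue[start + offset] = value - quantized  # keep the error as residue

    return packed, new_residue


# Example usage on a synthetic gradient vector (values are illustrative).
rng = np.random.default_rng(0)
grad = rng.standard_normal(256).astype(np.float32)
residue = np.zeros_like(grad)
packed, residue = compress_residual_gradients(grad, residue, bin_size=64, threshold_ratio=0.9)
print(f"transmitting {len(packed)} of {grad.size} entries")
```

In a distributed setting, only the `packed` pairs would be sent to another learner or a parameter server, while `residue` is retained locally and folded into the next mini-batch's gradient.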