Robust gradient weight compression schemes for deep learning applications

Invention Grant

US11295208B2 Robust gradient weight compression schemes for deep learning applications (Active)

Patent Title: Robust gradient weight compression schemes for deep learning applications
Application No.: US15830170

Application Date: 2017-12-04
Publication No.: US11295208B2

Publication Date: 2022-04-05
Inventor: Ankur Agrawal, Daniel Brand, Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Cantor Colburn LLP
Agent: Stosch Sabo
Main IPC: G06N3/08
IPC: G06N3/08; G06N3/04

Abstract:

Embodiments of the present invention provide a computer-implemented method for adaptive residual gradient compression for training of a deep learning neural network (DNN). The method includes obtaining, by a first learner, a current gradient vector for a neural network layer of the DNN, in which the current gradient vector includes gradient weights of parameters of the neural network layer that are calculated from a mini-batch of training data. A current residue vector is generated that includes residual gradient weights for the mini-batch. A compressed current residue vector is generated based on dividing the residual gradient weights of the current residue vector into a plurality of bins of a uniform size and quantizing a subset of the residual gradient weights of one or more bins of the plurality of bins. The compressed current residue vector is then transmitted to a second learner of the plurality of learners or to a parameter server.
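
The compression step described in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the residue is divided into uniform-size bins and that a subset of the weights in one or more bins is quantized; it does not give the selection rule or the quantizer. The sketch therefore assumes, purely for illustration, that the current residue is the previous residue plus the current gradient, that the largest-magnitude entry of each bin is selected, and that selected entries are quantized to a sign times one shared scale. The function name `compress_residue` and its parameters are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def compress_residue(
    gradient: List[float],
    prev_residue: List[float],
    bin_size: int,
) -> Tuple[List[int], List[float], List[float]]:
    """Return (indices sent, quantized values sent, residue kept locally)."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    if len(gradient) != len(prev_residue):
        raise ValueError("gradient and prev_residue must have equal length")

    # Assumption: current residue = carried-over residue + new mini-batch gradient.
    residue = [r + g for r, g in zip(prev_residue, gradient)]

    # Assumption: select the largest-magnitude entry of each uniform-size bin.
    selected = []
    for start in range(0, len(residue), bin_size):
        stop = min(start + bin_size, len(residue))
        selected.append(max(range(start, stop), key=lambda i: abs(residue[i])))

    if not selected:
        return [], [], residue

    # Assumption: quantize to sign * one shared scale (mean selected magnitude).
    scale = sum(abs(residue[i]) for i in selected) / len(selected)
    values = []
    for i in selected:
        q = scale if residue[i] >= 0 else -scale
        values.append(q)
        residue[i] -= q  # keep the quantization error for the next mini-batch

    return selected, values, residue


if __name__ == "__main__":
    grad = [0.1, -0.4, 0.2, 0.05, 0.3, -0.1]
    idx, vals, res = compress_residue(grad, [0.0] * len(grad), bin_size=3)
    print(idx, vals, res)
```

In this sketch only the selected indices and their quantized values would be transmitted to another learner or to a parameter server; the returned residue stays with the first learner and is passed back in as `prev_residue` on the next mini-batch.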

Public/Granted literature

US20190171935A1 ROBUST GRADIENT WEIGHT COMPRESSION SCHEMES FOR DEEP LEARNING APPLICATIONS Public/Granted day: 2019-06-06

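IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computing arrangements based on specific computational models
G06N3/00	Computing arrangements based on biological models
G06N3/02	.using neural network models
G06N3/08	..Learning methods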