Training giant neural networks using pipeline parallelism

Granted invention patent

US11232356B2 Training giant neural networks using pipeline parallelism (in force)

Patent title: Training giant neural networks using pipeline parallelism
Application number: US16989787

Filing date: 2020-08-10
Publication (announcement) number: US11232356B2

Publication (announcement) date: 2022-01-25
Inventors: Zhifeng Chen, Yanping Huang, Youlong Cheng, HyoukJoong Lee, Dehao Chen, Jiquan Ngiam
Applicant: Google LLC
Applicant address: US CA Mountain View
Patentee: Google LLC
Current patentee: Google LLC
Current patentee address: US CA Mountain View
Agency: Fish & Richardson P.C.
Main classification: G06N3/08
IPC classification: G06N3/08 ; G06N3/04

Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data specifying a partitioning of the neural network into N composite layers that form a sequence of composite layers, wherein each composite layer comprises a distinct plurality of layers from the multiple network layers of the neural network; obtaining data assigning each of the N composite layers to one or more computing devices from a set of N computing devices; partitioning a mini-batch of training examples into a plurality of micro-batches; and training the neural network, comprising: performing a forward pass through the neural network until output activations have been computed for each micro-batch for a final composite layer in the sequence, and performing a backward pass through the neural network until output gradients have been computed for each micro-batch for the first composite layer in the sequence.
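The steps in the abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: it assumes toy scalar layers of the form y = w * x, a squared-error loss, and an even split of the mini-batch, and it omits the assignment of composite layers to devices. The names `split_into_micro_batches` and `pipeline_gradients` are hypothetical. The sketch only shows the ordering (a forward pass over every micro-batch up to the final composite layer, then a backward pass down to the first composite layer) and that gradients accumulated over micro-batches match the full mini-batch gradient.

```python
from typing import List, Sequence, Tuple


def split_into_micro_batches(mini_batch: Sequence, num_micro_batches: int) -> List[list]:
    """Split a mini-batch into equally sized micro-batches (assumes even division)."""
    size, rem = divmod(len(mini_batch), num_micro_batches)
    if rem != 0:
        raise ValueError("mini-batch size must be divisible by num_micro_batches")
    return [list(mini_batch[i * size:(i + 1) * size]) for i in range(num_micro_batches)]


def pipeline_gradients(
    composite_layers: List[List[float]],
    mini_batch: Sequence[Tuple[float, float]],
    num_micro_batches: int,
) -> List[List[float]]:
    """Accumulate gradients of sum(0.5 * (y - t)**2) over all micro-batches.

    composite_layers[k] is a list of scalar weights; each weight is one toy
    layer computing y = w * x. mini_batch is a list of (x, target) pairs.
    """
    micro_batches = split_into_micro_batches(mini_batch, num_micro_batches)
    grads = [[0.0] * len(layer) for layer in composite_layers]

    # Forward pass: push every micro-batch through every composite layer,
    # saving the input seen by each weight so the backward pass can reuse it.
    saved = []    # saved[m][k][j]: inputs to weight j of composite layer k
    outputs = []  # outputs[m]: activations of the final composite layer
    for mb in micro_batches:
        acts = [x for x, _ in mb]
        per_layer = []
        for layer in composite_layers:
            layer_inputs = []
            for w in layer:
                layer_inputs.append(acts)
                acts = [w * a for a in acts]
            per_layer.append(layer_inputs)
        saved.append(per_layer)
        outputs.append(acts)

    # Backward pass: walk from the final composite layer to the first,
    # accumulating each weight's gradient across micro-batches.
    for m, mb in enumerate(micro_batches):
        upstream = [y - t for y, (_, t) in zip(outputs[m], mb)]
        for k in reversed(range(len(composite_layers))):
            layer = composite_layers[k]
            for j in reversed(range(len(layer))):
                inputs = saved[m][k][j]
                grads[k][j] += sum(g * a for g, a in zip(upstream, inputs))
                upstream = [g * layer[j] for g in upstream]
    return grads


# Hypothetical example: N = 2 composite layers, each holding two toy layers.
example_layers = [[2.0, 0.5], [3.0, 1.0]]
example_batch = [(1.0, 2.0), (2.0, 7.0), (3.0, 8.0), (4.0, 13.0)]
print(pipeline_gradients(example_layers, example_batch, num_micro_batches=2))
```

In a real pipeline each composite layer would run on its own device, so that one device can start on the next micro-batch while a later device is still processing the previous one. The loops above run sequentially and only reproduce the arithmetic.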

Published/granted documents

US20210042620A1 TRAINING GIANT NEURAL NETWORKS USING PIPELINE PARALLELISM Publication/grant date: 2021-02-11

Information lookup

Espacenet

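IPC classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computing arrangements based on specific computational models
G06N3/00	Computing arrangements based on biological models
G06N3/02	.Using neural network models
G06N3/08	..Learning methods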