TRAINING NEURAL NETWORKS USING DISTRIBUTED BATCH NORMALIZATION
Abstract:
Methods, systems, and apparatus, including computer programs encoded on storage media, for performing reduction of gradient vectors for distributed training of a neural network. One of the methods includes receiving, at each device of a plurality of devices, a respective batch of training examples; performing, by each device, a forward pass comprising, for each batch normalization layer: generating, by each of the devices, a respective output of a corresponding other layer of the neural network for each training example in the batch, determining, by each of the devices, a per-replica mean and a per-replica variance from the respective outputs; determining, for each sub-group of the devices, a distributed mean and a distributed variance from the per-replica means and the per-replica variances for the devices in the sub-group; and applying, by each device, batch normalization to the respective outputs of the corresponding other layer generated by the device using the distributed mean and the distributed variance for the sub-group to which the device belongs.
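The abstract describes the computation in claim language only. The following is a minimal sketch of the idea in JAX, not the patented implementation: the sub-group layout, function names, shapes, and epsilon are illustrative assumptions. It reduces the per-replica mean and mean-of-squares across each sub-group via jax.lax.pmean with axis_index_groups (equivalent to combining per-replica means and variances when per-replica batch sizes are equal) and omits the learned scale and offset of a full batch normalization layer.

```python
import jax
import jax.numpy as jnp

# Illustrative sub-group layout: split the available devices into two
# sub-groups of replicas (falls back to a single group on one device).
n = jax.device_count()
GROUPS = [list(range(n // 2)), list(range(n // 2, n))] if n > 1 else [[0]]

def distributed_batch_norm(x, eps=1e-5):
    """Normalizes x with mean/variance reduced over this replica's sub-group.

    x: per-replica activations of shape [batch_per_replica, features],
       i.e., the output of the corresponding other layer on this replica.
    """
    # Per-replica statistics over the local batch dimension.
    per_replica_mean = jnp.mean(x, axis=0)
    per_replica_mean_sq = jnp.mean(jnp.square(x), axis=0)

    # Distributed statistics: average the per-replica moments across the
    # sub-group only. axis_index_groups restricts the cross-replica mean
    # to each sub-group rather than reducing over all devices.
    dist_mean = jax.lax.pmean(per_replica_mean, axis_name="devices",
                              axis_index_groups=GROUPS)
    dist_mean_sq = jax.lax.pmean(per_replica_mean_sq, axis_name="devices",
                                 axis_index_groups=GROUPS)
    # Var[x] = E[x^2] - E[x]^2 over the pooled sub-group batch
    # (assumes equal per-replica batch sizes).
    dist_var = dist_mean_sq - jnp.square(dist_mean)

    # Apply batch normalization using the distributed statistics.
    return (x - dist_mean) * jax.lax.rsqrt(dist_var + eps)

# Usage: run one replica per device; leading axis indexes devices.
activations = jnp.ones((n, 16, 32))  # [devices, batch_per_replica, features]
normalized = jax.pmap(distributed_batch_norm, axis_name="devices")(activations)
```

Reducing only the first two moments keeps the communication cost per batch normalization layer at two feature-sized vectors per replica, regardless of the per-replica batch size.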