Publication number: US20240256835A1
Publication date: 2024-08-01
Application number: US18424420
Filing date: 2024-01-26
Applicant: Google LLC
Inventors: Mostafa Dehghani, Josip Djolonga, Jonathan Heek, Basil Mustafa, Piotr Michal Padlewski, Justin Morgan Gilmer, Neil Matthew Tinmouth Houlsby
IPC: G06N3/0455, G06N3/088
CPC classification number: G06N3/0455, G06N3/088
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing an input through each of a plurality of layers of a neural network to generate an output using a plurality of hardware accelerators. The plurality of layers comprise a fully connected layer having a plurality of parameters arranged in a row dimension and a column dimension. One of the methods comprises: generating a plurality of parameter blocks by partitioning the plurality of parameters along the row dimension and the column dimension; determining a ratio of a number of parameters along the row dimension relative to a number of parameters along the column dimension; and determining whether to use row sharding or column sharding with the plurality of hardware accelerators to calculate an output for the fully connected layer and then calculating the output for the fully connected layer using either row sharding or column sharding.
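The choice between row sharding and column sharding described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the claimed method. It assumes a weight matrix `w` of shape rows x cols applied as `y = x @ w`, simulates the accelerators with a plain loop, and skips the step of partitioning the parameters into blocks along both dimensions. The function names, the `threshold` parameter, and the rule "use row sharding when rows / cols is at least the threshold" are all hypothetical; the abstract states only that a ratio is determined and a sharding mode is chosen.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(x: Vector, w: Matrix) -> Vector:
    """Compute x @ w for a vector x (length rows) and a matrix w (rows x cols)."""
    if not w:
        return []
    cols = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(w))) for j in range(cols)]


def split_ranges(n: int, parts: int) -> List[Tuple[int, int]]:
    """Split range(n) into `parts` contiguous, nearly equal (start, stop) ranges."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def row_sharded(x: Vector, w: Matrix, num_devices: int) -> Vector:
    """Each simulated device holds a slice of rows of w and the matching slice
    of x; the partial outputs are summed (an all-reduce on real hardware)."""
    out = [0.0] * len(w[0])
    for start, stop in split_ranges(len(w), num_devices):
        partial = matvec(x[start:stop], w[start:stop])
        if partial:  # a device may hold no rows if num_devices > rows
            out = [a + b for a, b in zip(out, partial)]
    return out


def column_sharded(x: Vector, w: Matrix, num_devices: int) -> Vector:
    """Each simulated device holds a slice of columns of w and the full x;
    the output slices are concatenated (an all-gather on real hardware)."""
    out: Vector = []
    for start, stop in split_ranges(len(w[0]), num_devices):
        shard = [row[start:stop] for row in w]
        out.extend(matvec(x, shard))
    return out


def choose_sharding(rows: int, cols: int, threshold: float = 1.0) -> str:
    """Hypothetical decision rule: the threshold and its direction are
    assumptions made for this sketch, not taken from the source."""
    return "row" if rows / cols >= threshold else "column"


def fully_connected(x: Vector, w: Matrix, num_devices: int,
                    threshold: float = 1.0) -> Vector:
    """Pick a sharding mode from the row/column ratio, then compute x @ w."""
    mode = choose_sharding(len(w), len(w[0]), threshold)
    if mode == "row":
        return row_sharded(x, w, num_devices)
    return column_sharded(x, w, num_devices)
```

Both modes produce the same output as the unsharded product; they differ in which tensor is split across devices and in the collective operation needed to assemble the result, which is why the shape of the layer can influence the choice.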