-
Publication No.: US20250148357A1
Publication date: 2025-05-08
Application No.: US18504016
Filing date: 2023-11-07
Applicant: Google LLC
Inventor: Aditya Binodkumar Agrawal, Blake Alan Hechtman, Matthew Leever Hedlund, David Alexander Majnemer, Marissa Karen Ikonomidis
IPC: G06N20/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for compressing a machine learning model having a plurality of parameters. In one aspect, one of the methods includes obtaining trained values of a set of parameters for at least a portion of a machine learning model; identifying one or more dense ranges for the trained values; determining a least number of bits required to represent each trained value within the one or more dense ranges; identifying a second format having a range that is smaller than a range of the first format; and generating a compressed version of the at least a portion of the machine learning model.
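The abstract does not say how a dense range is found or how the bit count is derived. The following is a minimal sketch of one possible reading, not the claimed method. It assumes the trained values are stored as IEEE 754 float32, that a "dense range" is the narrowest contiguous window of exponent values covering a chosen fraction of the parameters, and that the bit count of interest is the number of exponent bits needed to index that window. The function names and the `coverage` parameter are hypothetical.

```python
import struct
from collections import Counter
from math import ceil


def float32_exponent(x: float) -> int:
    """Return the biased 8-bit exponent field of x when stored as float32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return (bits >> 23) & 0xFF


def dense_exponent_range(values, coverage=0.99):
    """Narrowest contiguous exponent window holding >= coverage of the values.

    Assumption: density is measured by counting values per exponent.
    """
    counts = Counter(float32_exponent(v) for v in values)
    exps = sorted(counts)
    needed = ceil(coverage * len(values))
    best = (exps[0], exps[-1])  # the full range always covers everything
    lo = 0
    running = 0
    for hi in range(len(exps)):
        running += counts[exps[hi]]
        # Shrink from the left while the window still covers enough values.
        while running - counts[exps[lo]] >= needed:
            running -= counts[exps[lo]]
            lo += 1
        if running >= needed and exps[hi] - exps[lo] < best[1] - best[0]:
            best = (exps[lo], exps[hi])
    return best


def min_exponent_bits(lo: int, hi: int) -> int:
    """Fewest bits that can index every exponent in [lo, hi]."""
    distinct = hi - lo + 1
    return max(1, (distinct - 1).bit_length())


# Hypothetical trained values: most lie in [0.5, 2), one is a tiny outlier.
values = [0.5, 0.6, 0.7, 0.9, 1.0, 1.5, 1e-20]
lo, hi = dense_exponent_range(values, coverage=0.8)
bits = min_exponent_bits(lo, hi)
```

In this toy input the window spans two exponent values, so a single exponent bit is enough for the covered parameters, compared with the eight exponent bits of float32. How outliers outside the window are handled is not specified in the abstract and is left out here.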
-
Publication No.: US20240378416A1
Publication date: 2024-11-14
Application No.: US18444267
Filing date: 2024-02-16
Applicant: Google LLC
Inventor: Blake Alan Hechtman, Sameer Kumar
Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors for distributed training of a neural network. One of the methods includes receiving, at each of the plurality of devices, a respective batch; performing, by each device, a forward pass comprising, for each batch normalization layer: generating, by each of the devices, a respective output of the corresponding other layer for each training example in the batch, determining, by each of the devices, a per-replica mean and a per-replica variance; determining, for each sub-group, a distributed mean and a distributed variance from the per-replica means and the per-replica variances for the devices in the sub-group; and applying, by each device, batch normalization to the respective outputs of the corresponding other layer generated by the device using the distributed mean and the distributed variance for the sub-group to which the device belongs.
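The combination of per-replica statistics into sub-group statistics can be sketched as follows. This is an illustrative sketch only. It assumes every device in a sub-group holds the same number of examples, treats each layer output as a single scalar feature, and uses the identity Var[x] = E[x^2] - (E[x])^2 to merge variances. The function names and the `eps` constant are hypothetical, and the cross-device communication step is replaced by a plain Python list.

```python
from statistics import fmean, pvariance


def per_replica_stats(outputs):
    """Mean and (population) variance of one device's layer outputs."""
    return fmean(outputs), pvariance(outputs)


def subgroup_stats(replica_stats):
    """Merge per-replica (mean, variance) pairs for one sub-group.

    Assumes equal batch sizes per replica, so plain averages suffice.
    """
    means = [m for m, _ in replica_stats]
    second_moments = [v + m * m for m, v in replica_stats]  # E[x^2] per replica
    mean = fmean(means)
    variance = fmean(second_moments) - mean * mean
    return mean, variance


def batch_norm(outputs, mean, variance, eps=1e-5):
    """Normalize one device's outputs with the sub-group statistics."""
    scale = (variance + eps) ** -0.5
    return [(x - mean) * scale for x in outputs]


# Two hypothetical devices in one sub-group, two examples each.
device_outputs = [[1.0, 2.0], [3.0, 4.0]]
stats = [per_replica_stats(o) for o in device_outputs]
dist_mean, dist_var = subgroup_stats(stats)
normalized = [batch_norm(o, dist_mean, dist_var) for o in device_outputs]
```

With equal batch sizes, the merged statistics match what a single device would compute over the pooled examples of the sub-group, which is why only the per-replica means and variances need to be exchanged.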
-
Publication No.: US11907825B2
Publication date: 2024-02-20
Application No.: US16659543
Filing date: 2019-10-21
Applicant: Google LLC
Inventor: Blake Alan Hechtman, Sameer Kumar
Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors for distributed training of a neural network. One of the methods includes receiving, at each of the plurality of devices, a respective batch; performing, by each device, a forward pass comprising, for each batch normalization layer: generating, by each of the devices, a respective output of the corresponding other layer for each training example in the batch, determining, by each of the devices, a per-replica mean and a per-replica variance; determining, for each sub-group, a distributed mean and a distributed variance from the per-replica means and the per-replica variances for the devices in the sub-group; and applying, by each device, batch normalization to the respective outputs of the corresponding other layer generated by the device using the distributed mean and the distributed variance for the sub-group to which the device belongs.
-
Publication No.: US20230418797A1
Publication date: 2023-12-28
Application No.: US18341697
Filing date: 2023-06-26
Applicant: Google LLC
Inventor: Felix Ren-Chyan Chern, Blake Alan Hechtman, Andrew Thomas Davis, Ruiqi Guo, Sanjiv Kumar, David Alexander Majnemer
CPC classification number: G06F16/2237, G06F16/285
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a kNN computation using a hardware accelerator. One of the methods includes obtaining a set of one or more query vectors; obtaining a set of database vectors; and performing, on a hardware accelerator and for each query vector in the set, a search for the k most similar database vectors to the query vector, comprising: computing, by circuitry of the hardware accelerator and for each query vector, a respective similarity value between the query vector and each database vector; and for each query vector, identifying, by the hardware accelerator and for each bin, (i) an index of the most similar database vector within the bin and (ii) the respective similarity value for the most similar database vector within the bin.
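The per-bin selection described above can be sketched in plain Python. This is a sketch under stated assumptions, not the accelerator implementation: similarity is taken to be the dot product, bins are assumed to be contiguous equal-sized slices of the database, and the final step that keeps the k best bin winners is an assumption, since the abstract stops at the per-bin result. All function names are hypothetical.

```python
def dot(a, b):
    """Dot-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def per_bin_best(query, database, num_bins):
    """For each bin, return (index, similarity) of its most similar vector.

    Assumption: bins are contiguous slices of the database.
    """
    bin_size = -(-len(database) // num_bins)  # ceiling division
    winners = []
    for start in range(0, len(database), bin_size):
        stop = min(start + bin_size, len(database))
        sims = {i: dot(query, database[i]) for i in range(start, stop)}
        best_index = max(sims, key=sims.get)
        winners.append((best_index, sims[best_index]))
    return winners


def approx_top_k(query, database, k, num_bins):
    """Keep the k best bin winners (assumed final step, approximate kNN)."""
    winners = per_bin_best(query, database, num_bins)
    return sorted(winners, key=lambda w: w[1], reverse=True)[:k]


# Hypothetical data: six database vectors split into three bins of two.
database = [[1, 0], [0, 1], [2, 0], [0, 3], [1, 1], [3, 3]]
query = [1, 0]
winners = per_bin_best(query, database, num_bins=3)
top2 = approx_top_k(query, database, k=2, num_bins=3)
```

Keeping one candidate per bin means two true neighbours that fall in the same bin cannot both be returned, so the result is approximate unless the number of bins is large relative to k.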
-
Publication No.: US11763142B2
Publication date: 2023-09-19
Application No.: US17902776
Filing date: 2022-09-02
Applicant: Google LLC
IPC: G06N3/063, G06N3/04, G06N3/06, G06N3/10, G06F17/15, G06F17/16, G06F30/18, G06F30/20, G06F30/27, G06F30/367, G06N3/086, G06N3/045
CPC classification number: G06N3/063, G06F17/15, G06F17/16, G06F30/18, G06F30/20, G06F30/27, G06F30/367, G06N3/045, G06N3/086, G06N3/10
Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
-