-
Publication No.: US20240220768A1
Publication date: 2024-07-04
Application No.: US18285977
Filing date: 2022-04-06
Applicant: Google LLC
Inventor: Dan Zhang, Safeen Huda, Azalia Mirhoseini, Anna Darling Goldie, Ebrahim Songhori
IPC: G06N3/04
CPC classification number: G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining a hardware datapath for a hardware accelerator computer chip.
-
Publication No.: US20240370693A1
Publication date: 2024-11-07
Application No.: US18285578
Filing date: 2022-04-06
Applicant: Google LLC
Inventor: Dan Zhang, Safeen Huda, Azalia Mirhoseini, Anna Darling Goldie, Ebrahim Songhori
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining a hardware datapath for a hardware accelerator computer chip.
-
Publication No.: US20230162010A1
Publication date: 2023-05-25
Application No.: US17532572
Filing date: 2021-11-22
Applicant: Google LLC
Inventor: Azalia Mirhoseini, Safeen Huda, Martin Christoph Maas, Paras Jagdish Jain, Jeffrey Adgate Dean
CPC classification number: G06N3/063, G06F15/8046, G06F11/3409, G06F11/3062, G06F11/3024
Abstract: Systems and methods are provided for designing approximate, low-power deep learning accelerator chips that have little to no accuracy loss when executing a deep learning model. A set of approximate systolic arrays may be generated. The performance of each approximate systolic array in the set of approximate systolic arrays processing a deep neural network (DNN) may be determined. Each layer in the DNN may be mapped to an approximate systolic array in the set of approximate systolic arrays. A subset of the set of approximate systolic arrays may be selected for inclusion in the inference chip design based on the mapping and the performance of each approximate systolic array in the set of approximate systolic arrays.
-
Publication No.: US20230119235A1
Publication date: 2023-04-20
Application No.: US17968048
Filing date: 2022-10-18
Applicant: Google LLC
Inventor: Michael David Hutton, Georgios Konstadinidis, Lluis-Miquel Munguia, Safeen Huda, Gaurav Agrawal
Abstract: A method and system for controlling performance of a workload partitioned among a plurality of accelerator chips of a multi-chip system. One or more processors may receive performance speed data for each of the accelerator chips, obtain a model of the partitioned workload, determine a portion of the workload that is either overworked or underworked based on the model of the partitioned workload and the performance speed data for each of the plurality of accelerator chips, and adjust a performance speed of an accelerator chip that performs the portion of the partitioned workload that is either overworked or underworked.
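The control loop described in the abstract of US20230119235A1 can be illustrated with a minimal sketch. This is not the patented method: the `Chip` record, the use of the mean step time as the balance target, the tolerance band, and the speed limits are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Chip:
    name: str
    work: float   # hypothetical: work units the partition model assigns to this chip
    speed: float  # hypothetical: reported performance speed, in work units per second


def step_time(chip: Chip) -> float:
    """Time the chip needs to finish its share of the partitioned workload."""
    return chip.work / chip.speed


def adjust_speeds(chips, tolerance=0.05, min_speed=0.5, max_speed=2.0):
    """Return {name: new_speed}, changing only chips outside the tolerance band.

    Assumption: a chip is "overworked" if its step time exceeds the mean step
    time by more than `tolerance`, and "underworked" if it falls below the mean
    by more than `tolerance`.
    """
    target = mean(step_time(c) for c in chips)
    new_speeds = {}
    for c in chips:
        t = step_time(c)
        if abs(t - target) <= tolerance * target:
            new_speeds[c.name] = c.speed  # balanced: leave the speed unchanged
        else:
            # Overworked chips (t > target) speed up; underworked ones slow down.
            wanted = c.work / target
            new_speeds[c.name] = min(max_speed, max(min_speed, wanted))
    return new_speeds


chips = [Chip("a", 10.0, 1.0), Chip("b", 5.0, 1.0), Chip("c", 7.5, 1.0)]
speeds = adjust_speeds(chips)
```

Here chip `a` is treated as overworked and is sped up, chip `b` as underworked and is slowed down, and chip `c` is left alone. The clamp to `min_speed` and `max_speed` stands in for whatever limits a real system would impose.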