-
Publication number: US20210073028A1
Publication date: 2021-03-11
Application number: US16600437
Filing date: 2019-10-11
Applicant: Google LLC
Inventor: Sheng Li, Brian Zhang, Liqun Cheng, Norman Paul Jouppi, Yun Ni
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented as a computational graph on a distributed computing network. A method includes: receiving data representing operations to be executed in order to perform a job on a plurality of hardware accelerators of a plurality of different accelerator types; generating, for the job and from at least the data representing the operations, features that represent a predicted performance for the job on hardware accelerators of the plurality of different accelerator types; generating, from the features, a respective predicted performance metric for the job for each of the plurality of different accelerator types according to a performance objective function; and providing, to a scheduling system, one or more recommendations for scheduling the job on one or more recommended types of hardware accelerators.
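The pipeline in this abstract (job features, then a predicted metric per accelerator type, then a recommendation) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the accelerator names and specs, the roofline-style runtime estimate, and the performance-per-dollar objective are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobFeatures:
    """Hypothetical features extracted from a job's computational graph."""
    flops: float        # total floating-point operations
    bytes_moved: float  # total bytes read and written


# Hypothetical accelerator types: (peak FLOP/s, memory bandwidth in B/s, cost per hour).
ACCELERATORS = {
    "type_a": (100e12, 1.0e12, 4.0),
    "type_b": (50e12, 2.0e12, 3.0),
}


def predicted_runtime_s(features, peak_flops, bandwidth):
    """Roofline-style estimate: the job is bound by compute or by memory traffic."""
    return max(features.flops / peak_flops, features.bytes_moved / bandwidth)


def objective(runtime_s, cost_per_hour):
    """Assumed performance objective: jobs completed per dollar (higher is better)."""
    return 1.0 / (runtime_s * cost_per_hour / 3600.0)


def recommend(features, top_k=1):
    """Score every accelerator type and return the top_k type names."""
    scores = {
        name: objective(predicted_runtime_s(features, peak, bw), cost)
        for name, (peak, bw, cost) in ACCELERATORS.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


compute_bound = JobFeatures(flops=1e15, bytes_moved=1e12)
memory_bound = JobFeatures(flops=1e12, bytes_moved=1e13)
print(recommend(compute_bound), recommend(memory_bound))
```

In this toy setup the compute-bound job favors the type with higher peak throughput, while the memory-bound job favors the type with higher bandwidth; a real system would presumably learn the predictor rather than hard-code it.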
-
Publication number: US20250077833A1
Publication date: 2025-03-06
Application number: US18821971
Filing date: 2024-08-30
Applicant: Google LLC
Inventor: Sheng Li, Norman Paul Jouppi, Quoc V. Le, Mingxing Tan, Ruoming Pang, Liqun Cheng, Andrew Li
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining an architecture for a task neural network that is configured to perform a particular machine learning task on a target set of hardware resources. When deployed on a target set of hardware, such as a collection of datacenter accelerators, the task neural network may be capable of performing the particular machine learning task with enhanced accuracy and speed.
-
Publication number: US12131244B2
Publication date: 2024-10-29
Application number: US17039178
Filing date: 2020-09-30
Applicant: Google LLC
Inventor: Sheng Li, Norman Paul Jouppi, Quoc V. Le, Mingxing Tan, Ruoming Pang, Liqun Cheng, Andrew Li
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining an architecture for a task neural network that is configured to perform a particular machine learning task on a target set of hardware resources. When deployed on a target set of hardware, such as a collection of datacenter accelerators, the task neural network may be capable of performing the particular machine learning task with enhanced accuracy and speed.
-
Publication number: US20240037373A1
Publication date: 2024-02-01
Application number: US17875594
Filing date: 2022-07-28
Applicant: Google LLC
Inventor: Sheng Li, Norman Paul Jouppi, Garrett Axel Andersen, Quoc V. Le, Liqun Cheng, Parthasarathy Ranganathan
CPC classification number: G06N3/0454, G06N3/063
Abstract: Aspects of the disclosure are directed to jointly searching machine learning model architectures and hardware architectures in a combined space of models, hardware, and mapping strategies. A search strategy is utilized where all models, hardware, and mappings are evaluated together at once via weight sharing and a supernetwork. A multi-objective reward function is utilized with objectives for quality, performance, power, and area.
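A multi-objective reward over quality, performance, power, and area could take many forms. The sketch below uses a weighted-product form in which quality is scaled down as each measured cost exceeds its target. The functional form, the exponent value, and all names are assumptions for illustration; the abstract does not specify them.

```python
def multi_objective_reward(quality, measured, targets, exponent=-0.07):
    """Combine quality with cost objectives into one scalar reward.

    quality:  model quality, e.g. accuracy in [0, 1]
    measured: dict of measured costs, e.g. latency, power, area
    targets:  dict of target values for the same keys
    exponent: assumed negative weight; costs above target shrink the reward
    """
    reward = quality
    for name, target in targets.items():
        reward *= (measured[name] / target) ** exponent
    return reward


targets = {"latency_ms": 10.0, "power_w": 5.0, "area_mm2": 100.0}
on_target = multi_objective_reward(0.8, dict(targets), targets)
too_slow = multi_objective_reward(0.8, {**targets, "latency_ms": 20.0}, targets)
print(on_target, too_slow)
```

A candidate that meets every target keeps its quality score unchanged, and a candidate that misses a target is penalized smoothly rather than rejected outright.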
-
Publication number: US20220230048A1
Publication date: 2022-07-21
Application number: US17175029
Filing date: 2021-02-12
Applicant: Google LLC
Inventor: Andrew Li, Sheng Li, Mingxing Tan, Ruoming Pang, Liqun Cheng, Quoc V. Le, Norman Paul Jouppi
Abstract: Methods, systems, and apparatus, including computer-readable media, for scaling neural network architectures on hardware accelerators. A method includes receiving training data and information specifying target computing resources, and performing, using the training data, a neural architecture search over a search space to identify an architecture for a base neural network. A plurality of scaling parameter values for scaling the base neural network can be identified, which can include repeatedly selecting a plurality of candidate scaling parameter values, and determining a measure of performance for the base neural network scaled according to the plurality of candidate scaling parameter values, in accordance with a plurality of second objectives including a latency objective. An architecture for a scaled neural network can be determined using the architecture of the base neural network scaled according to the plurality of scaling parameter values.
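The scaling step described here (repeatedly pick candidate scaling values, measure the scaled network against objectives that include latency, keep the best) can be sketched as a small grid search. The accuracy and latency proxies, the candidate values, and the use of depth, width, and resolution as the scaling parameters are all assumptions for illustration, not details from the abstract.

```python
import itertools


def toy_accuracy(depth, width, resolution):
    """Hypothetical accuracy proxy with diminishing returns as the network grows."""
    return 1.0 - 0.5 / (depth * width * resolution) ** 0.5


def toy_latency_ms(depth, width, resolution):
    """Hypothetical latency proxy: linear in depth, quadratic in width and resolution."""
    return 5.0 * depth * width ** 2 * resolution ** 2


def search_scaling(latency_budget_ms, candidates=(1.0, 1.2, 1.4)):
    """Return the most accurate (depth, width, resolution) within the latency budget."""
    best, best_score = None, float("-inf")
    for depth, width, resolution in itertools.product(candidates, repeat=3):
        if toy_latency_ms(depth, width, resolution) > latency_budget_ms:
            continue  # violates the latency objective
        score = toy_accuracy(depth, width, resolution)
        if score > best_score:
            best, best_score = (depth, width, resolution), score
    return best


print(search_scaling(latency_budget_ms=6.5))
```

With these toy proxies a tight budget leaves only depth scaling affordable, because width and resolution raise latency quadratically. On real hardware the latency of each candidate would be measured or predicted rather than computed from a closed-form proxy.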
-
Publication number: US20230297580A1
Publication date: 2023-09-21
Application number: US17721873
Filing date: 2022-04-15
Applicant: Google LLC
Inventor: Sheng Li, Garrett Axel Andersen, Norman Paul Jouppi, Quoc V. Le, Liqun Cheng, Parthasarathy Ranganathan, Julian Paul Grady, Yang Li, Martin Wicke, Yifeng Lu, Yun Ni, Kun Wang
IPC: G06F16/2457, G06F16/2455, G06N3/063
CPC classification number: G06F16/2457, G06F16/24554, G06N3/063
Abstract: According to various implementations, generally disclosed herein is a hybrid and hierarchical neural architecture search (NAS) approach. The approach includes performing a search space partitioning scheme to divide the search space into sub-search spaces. The approach further includes performing a first type of NAS, such as a Multi-trial NAS, to cover a search across the sub-search spaces. The approach also includes performing a second type of NAS, such as a One-Shot NAS, to cover each sub-search space. The approach further includes automatically stopping the second type of NAS based on one or more early stopping criteria.
-
Publication number: US20230222000A1
Publication date: 2023-07-13
Application number: US18091951
Filing date: 2022-12-30
Applicant: Google LLC
Inventor: Sheng Li, Brian Zhang, Liqun Cheng, Norman Paul Jouppi, Yun Ni
CPC classification number: G06F9/5044, G06F9/545, G06F11/3612, G06F9/5066, G06F18/214, G06N3/08, G06F9/5011, G06F11/3409, G06F9/4881, G06F2209/501
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented as a computational graph on a distributed computing network. A method includes: receiving data representing operations to be executed in order to perform a job on a plurality of hardware accelerators of a plurality of different accelerator types; generating, for the job and from at least the data representing the operations, features that represent a predicted performance for the job on hardware accelerators of the plurality of different accelerator types; generating, from the features, a respective predicted performance metric for the job for each of the plurality of different accelerator types according to a performance objective function; and providing, to a scheduling system, one or more recommendations for scheduling the job on one or more recommended types of hardware accelerators.
-
Publication number: US20240231667A1
Publication date: 2024-07-11
Application number: US18152428
Filing date: 2023-01-10
Applicant: Google LLC
Inventor: Sheng Li, Sridhar Lakshmanamurthy, Norman Paul Jouppi, Martin Guy Dixon, Daniel Stodolsky, Quoc V. Le, Liqun Cheng, Erik Karl Norden, Parthasarathy Ranganathan
IPC: G06F3/06
CPC classification number: G06F3/0647, G06F3/0611, G06F3/067
Abstract: Aspects of the disclosure are directed to a heterogeneous machine learning accelerator system with compute and memory nodes connected by high speed chip-to-chip interconnects. While existing remote/disaggregated memory may require memory expansion via remote processing units, aspects of the disclosure add memory nodes into machine learning accelerator clusters via the chip-to-chip interconnects without needing assistance from remote processing units to achieve higher performance, simpler software stack, and/or lower cost. The memory nodes may support prefetch and intelligent compression to enable the use of low cost memory without performance degradation.
-
Publication number: US11544105B2
Publication date: 2023-01-03
Application number: US16600437
Filing date: 2019-10-11
Applicant: Google LLC
Inventor: Sheng Li, Brian Zhang, Liqun Cheng, Norman Paul Jouppi, Yun Ni
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented as a computational graph on a distributed computing network. A method includes: receiving data representing operations to be executed in order to perform a job on a plurality of hardware accelerators of a plurality of different accelerator types; generating, for the job and from at least the data representing the operations, features that represent a predicted performance for the job on hardware accelerators of the plurality of different accelerator types; generating, from the features, a respective predicted performance metric for the job for each of the plurality of different accelerator types according to a performance objective function; and providing, to a scheduling system, one or more recommendations for scheduling the job on one or more recommended types of hardware accelerators.
-
Publication number: US20220019869A1
Publication date: 2022-01-20
Application number: US17039178
Filing date: 2020-09-30
Applicant: Google LLC
Inventor: Sheng Li, Norman Paul Jouppi, Quoc V. Le, Mingxing Tan, Ruoming Pang, Liqun Cheng, Andrew Li
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining an architecture for a task neural network that is configured to perform a particular machine learning task on a target set of hardware resources. When deployed on a target set of hardware, such as a collection of datacenter accelerators, the task neural network may be capable of performing the particular machine learning task with enhanced accuracy and speed.