-
公开(公告)号:US11966766B2
公开(公告)日:2024-04-23
申请号:US17076393
申请日:2020-10-21
Applicant: Google LLC
Inventor: Chang Lan , Soroush Radpour
CPC classification number: G06F9/45558 , G06N3/08 , G06N3/084 , G06F2009/45562 , G06F2009/4557 , G06N3/098
Abstract: A data processing system, that includes: one or more host processing devices, the one or more host processing devices may be configured to support instantiation of a plurality of virtual machines such that a first set of virtual machines run one or more worker processes, each worker process operating on a respective data set to produce a respective gradient. The host processing devices may be configured to support instantiation of a second set of virtual machines running one or more reducer processes that operate on each respective gradient produced by each worker process to produce an aggregated gradient. The one or more reducer processes may cause the aggregated gradient to be broadcasted to each worker process.
-
公开(公告)号:US20240112027A1
公开(公告)日:2024-04-04
申请号:US18477546
申请日:2023-09-28
Applicant: Google LLC
Inventor: Yanqi Zhou , Yanping Huang , Yifeng Lu , Andrew M. Dai , Siamak Shakeri , Zhifeng Chen , James Laudon , Quoc V. Le , Da Huang , Nan Du , David Richard So , Daiyi Peng , Yingwei Cui , Jeffrey Adgate Dean , Chang Lan
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing neural architecture search for machine learning models. In one aspect, a method comprises receiving training data for a machine learning, generating a plurality of candidate neural networks for performing the machine learning task, wherein each candidate neural network comprises a plurality of instances of a layer block composed of a plurality of layers, for each candidate neural network, selecting a respective type for each of the plurality of layers from a set of layer types that comprises, training the candidate neural network and evaluating performance scores for the trained candidate neural networks as applied to the machine learning task, and determining a final neural network for performing the machine learning task based at least on the performance scores for the candidate neural networks.
-
公开(公告)号:US20240311402A1
公开(公告)日:2024-09-19
申请号:US18136634
申请日:2023-04-19
Applicant: GOOGLE LLC
Inventor: Martin Baeuml , Yanping Huang , Wenhao Jia , Chang Lan , Yuanzhong Xu , Junwhan Ahn , Alexander Bailey , Leif Schelin , Trevor Strohman , Emanuel Taropa , Sidharth Mudgal , Yanyan Zheng , Zhifeng Chen , Ahmad Beirami
IPC: G06F16/332 , G06F40/40
CPC classification number: G06F16/3322 , G06F16/3329 , G06F40/40
Abstract: Implementations relate to reducing latency in generating and/or rendering natural language (NL) output generated using a large language model (LLM). Processor(s) of a system can: receive NL based input associated with a client device, and generate the NL based output utilizing the LLM. The NL based output can be a stream of NL based output in that it includes a plurality of segments, and is generated on a segment-by-segment basis. In some implementations, a first segment of the stream of NL based output is selected for inclusion in the stream of NL based output as a second segment (and any subsequent segment) is being generated to reduce latency in evaluating the NL based output as a whole prior to rendering thereof. In some versions of those implementations, the first segment is rendered as the second segment (and any subsequent segment) is being generated to further reduce latency in rendering thereof.
-
公开(公告)号:US20230409889A1
公开(公告)日:2023-12-21
申请号:US17842910
申请日:2022-06-17
Applicant: Google LLC
Inventor: Salem Elie Haykal , Arvind Krishnamurthy , Chang Lan , Soroush Radpour
CPC classification number: G06N3/063 , G06N3/08 , G06N3/0472
Abstract: Aspects of the disclosure are directed to performing disaggregation-aware model graph partitioning, which can include provisioning and load balancing disaggregated resource pools, such as general purpose processors, accelerators, general purpose memory, and high bandwidth memory. Across these disaggregated resource pools, machine learning model operations can be packed and/or batched. The partitioning can further include automatically tuning runtime parameters.
-
公开(公告)号:US20220121465A1
公开(公告)日:2022-04-21
申请号:US17076393
申请日:2020-10-21
Applicant: Google LLC
Inventor: Chang Lan , Soroush Radpour
Abstract: A data processing system, that includes: one or more host processing devices, the one or more host processing devices may be configured to support instantiation of a plurality of virtual machines such that a first set of virtual machines run one or more worker processes, each worker process operating on a respective data set to produce a respective gradient. The host processing devices may be configured to support instantiation of a second set of virtual machines running one or more reducer processes that operate on each respective gradient produced by each worker process to produce an aggregated gradient. The one or more reducer processes may cause the aggregated gradient to be broadcasted to each worker process.
-
-
-
-