Patent search ap:("GOOGLE LLC") AND inv:"Chang Lan" Page 1

1.

Granted invention patent
Reduction server for fast distributed training (in force)

Publication No.: US11966766B2

Publication date: 2024-04-23

Application No.: US17076393

Application date: 2020-10-21

Applicant: Google LLC

Inventor: Chang Lan, Soroush Radpour

IPC: G06F9/455, G06N3/08, G06N3/084, G06N3/098

CPC classification number: G06F9/45558, G06N3/08, G06N3/084, G06F2009/45562, G06F2009/4557, G06N3/098

Abstract: A data processing system that includes one or more host processing devices. The host processing devices may be configured to support instantiation of a plurality of virtual machines such that a first set of virtual machines runs one or more worker processes, each worker process operating on a respective data set to produce a respective gradient. The host processing devices may also be configured to support instantiation of a second set of virtual machines running one or more reducer processes that operate on each respective gradient produced by each worker process to produce an aggregated gradient. The one or more reducer processes may cause the aggregated gradient to be broadcast to each worker process.
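The worker/reducer data flow described in this abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: the virtual machines are replaced by plain function calls, the aggregation is assumed to be an element-wise sum, and the idea of giving each reducer one contiguous shard of the gradient is an assumption made for illustration. All function names (`split_into_shards`, `reduce_shard`, `reduction_round`) are hypothetical.

```python
from typing import List, Sequence

Gradient = List[float]


def split_into_shards(gradient: Sequence[float], num_shards: int) -> List[Gradient]:
    """Cut one worker's gradient into contiguous shards, one per reducer."""
    size, extra = divmod(len(gradient), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        # The first `extra` shards take one additional element each.
        end = start + size + (1 if i < extra else 0)
        shards.append(list(gradient[start:end]))
        start = end
    return shards


def reduce_shard(shards_from_workers: Sequence[Gradient]) -> Gradient:
    """Reducer step (assumed): element-wise sum of one shard across all workers."""
    return [sum(values) for values in zip(*shards_from_workers)]


def reduction_round(worker_gradients: Sequence[Gradient], num_reducers: int) -> List[Gradient]:
    """Aggregate all worker gradients and return one copy per worker."""
    per_worker_shards = [split_into_shards(g, num_reducers) for g in worker_gradients]
    aggregated: Gradient = []
    for r in range(num_reducers):
        # Reducer r sees shard r from every worker.
        aggregated.extend(reduce_shard([shards[r] for shards in per_worker_shards]))
    # "Broadcast": every worker receives its own copy of the aggregated gradient.
    return [list(aggregated) for _ in worker_gradients]


if __name__ == "__main__":
    grads = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [100.0, 200.0, 300.0]]
    for received in reduction_round(grads, num_reducers=2):
        print(received)
```

In a real deployment the worker-to-reducer and reducer-to-worker steps would be network transfers between virtual machines; the sketch only shows which data each role touches.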

2.

Invention publication
NEURAL NETWORK ARCHITECTURE SEARCH OVER COMPLEX BLOCK ARCHITECTURES (under examination, published)

Publication No.: US20240112027A1

Publication date: 2024-04-04

Application No.: US18477546

Application date: 2023-09-28

Applicant: Google LLC

Inventor: Yanqi Zhou, Yanping Huang, Yifeng Lu, Andrew M. Dai, Siamak Shakeri, Zhifeng Chen, James Laudon, Quoc V. Le, Da Huang, Nan Du, David Richard So, Daiyi Peng, Yingwei Cui, Jeffrey Adgate Dean, Chang Lan

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing neural architecture search for machine learning models. In one aspect, a method comprises: receiving training data for a machine learning task; generating a plurality of candidate neural networks for performing the machine learning task, wherein each candidate neural network comprises a plurality of instances of a layer block composed of a plurality of layers; for each candidate neural network, selecting a respective type for each of the plurality of layers from a set of layer types that comprises, training the candidate neural network and evaluating performance scores for the trained candidate neural networks as applied to the machine learning task; and determining a final neural network for performing the machine learning task based at least on the performance scores for the candidate neural networks.
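The search loop outlined in this abstract (generate candidates, pick a layer type per layer of a repeated block, train and score, keep the best) can be sketched as follows. This is a toy illustration under several assumptions: candidates are sampled uniformly at random, the layer type names are placeholders because the abstract does not enumerate the set, and `toy_score` is a stand-in for real training and evaluation. Every name in the block is hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Block = Tuple[str, ...]

# Placeholder names: the abstract does not enumerate the set of layer types.
LAYER_TYPES = ("layer_type_a", "layer_type_b", "layer_type_c")


def sample_block(layer_types: Sequence[str], layers_per_block: int,
                 rng: random.Random) -> Block:
    """Select a type for each layer of one block (assumed: uniform sampling)."""
    return tuple(rng.choice(layer_types) for _ in range(layers_per_block))


def build_network(block: Block, num_blocks: int) -> List[str]:
    """A candidate network is several instances of the same layer block."""
    return list(block) * num_blocks


def search(layer_types: Sequence[str], layers_per_block: int, num_blocks: int,
           num_candidates: int, train_and_score: Callable[[List[str]], float],
           seed: int = 0) -> Tuple[Block, float]:
    """Score every candidate and return the best block with its score."""
    rng = random.Random(seed)
    scored = []
    for _ in range(num_candidates):
        block = sample_block(layer_types, layers_per_block, rng)
        network = build_network(block, num_blocks)
        scored.append((train_and_score(network), block))
    best_score, best_block = max(scored, key=lambda pair: pair[0])
    return best_block, best_score


def toy_score(network: List[str]) -> float:
    """Stand-in for training plus evaluation; NOT a real performance metric."""
    return network.count("layer_type_a") / len(network)


if __name__ == "__main__":
    print(search(LAYER_TYPES, layers_per_block=4, num_blocks=3,
                 num_candidates=20, train_and_score=toy_score))
```

Passing the scoring function in as a parameter keeps the search loop independent of how candidates are actually trained, which is the part the sketch deliberately leaves out.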

3.

Invention publication
STREAMING OF NATURAL LANGUAGE (NL) BASED OUTPUT GENERATED USING A LARGE LANGUAGE MODEL (LLM) TO REDUCE LATENCY IN RENDERING THEREOF (under examination, published)

Publication No.: US20240311402A1

Publication date: 2024-09-19

Application No.: US18136634

Application date: 2023-04-19

Applicant: Google LLC

Inventor: Martin Baeuml, Yanping Huang, Wenhao Jia, Chang Lan, Yuanzhong Xu, Junwhan Ahn, Alexander Bailey, Leif Schelin, Trevor Strohman, Emanuel Taropa, Sidharth Mudgal, Yanyan Zheng, Zhifeng Chen, Ahmad Beirami

IPC: G06F16/332, G06F40/40

CPC classification number: G06F16/3322, G06F16/3329, G06F40/40

Abstract: Implementations relate to reducing latency in generating and/or rendering natural language (NL) output generated using a large language model (LLM). Processor(s) of a system can: receive NL based input associated with a client device, and generate the NL based output utilizing the LLM. The NL based output can be a stream of NL based output in that it includes a plurality of segments, and is generated on a segment-by-segment basis. In some implementations, a first segment of the stream of NL based output is selected for inclusion in the stream of NL based output as a second segment (and any subsequent segment) is being generated to reduce latency in evaluating the NL based output as a whole prior to rendering thereof. In some versions of those implementations, the first segment is rendered as the second segment (and any subsequent segment) is being generated to further reduce latency in rendering thereof.
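The segment-by-segment flow in this abstract can be illustrated with a small generator-based sketch. It is an approximation under stated assumptions: a Python generator interleaves generation and rendering rather than truly overlapping them, the LLM is replaced by a fixed list of segments, the `accept` check stands in for whatever per-segment selection the system performs, and stopping at the first rejected segment is an assumed policy. All names are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List


def generate_segments(segments: List[str], log: List[str]) -> Iterator[str]:
    """Stand-in for an LLM decoding loop that emits one segment at a time."""
    for i, segment in enumerate(segments):
        log.append(f"generated {i}")
        yield segment


def stream(segment_iter: Iterable[str], accept: Callable[[str], bool],
           render: Callable[[str], None], log: List[str]) -> List[str]:
    """Check and render each segment as soon as it arrives."""
    rendered = []
    for i, segment in enumerate(segment_iter):
        if not accept(segment):
            # Assumed policy: stop streaming at the first rejected segment.
            log.append(f"rejected {i}")
            break
        render(segment)
        log.append(f"rendered {i}")
        rendered.append(segment)
    return rendered


if __name__ == "__main__":
    events: List[str] = []
    stream(generate_segments(["Hello, ", "world", "!"], events),
           accept=lambda s: True,
           render=lambda s: print(s, end=""),
           log=events)
    print()
    print(events)
```

The event log makes the latency point visible: the first segment is rendered before the second one has been produced, instead of waiting for the whole output to be generated and evaluated.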

4.

Invention publication
Machine Learning Inference Service Disaggregation (under examination, published)

Publication No.: US20230409889A1

Publication date: 2023-12-21

Application No.: US17842910

Application date: 2022-06-17

Applicant: Google LLC

Inventor: Salem Elie Haykal, Arvind Krishnamurthy, Chang Lan, Soroush Radpour

IPC: G06N3/063, G06N3/08, G06N3/04

CPC classification number: G06N3/063, G06N3/08, G06N3/0472

Abstract: Aspects of the disclosure are directed to performing disaggregation-aware model graph partitioning, which can include provisioning and load balancing disaggregated resource pools, such as general purpose processors, accelerators, general purpose memory, and high bandwidth memory. Across these disaggregated resource pools, machine learning model operations can be packed and/or batched. The partitioning can further include automatically tuning runtime parameters.

5.

Invention application
Reduction Server for Fast Distributed Training (in force)

Publication No.: US20220121465A1

Publication date: 2022-04-21

Application No.: US17076393

Application date: 2020-10-21

Applicant: Google LLC

Inventor: Chang Lan, Soroush Radpour

IPC: G06F9/455, G06F9/48, G06F9/38, G06F9/54, G06N3/08, G06K9/62

Abstract: A data processing system that includes one or more host processing devices. The host processing devices may be configured to support instantiation of a plurality of virtual machines such that a first set of virtual machines runs one or more worker processes, each worker process operating on a respective data set to produce a respective gradient. The host processing devices may also be configured to support instantiation of a second set of virtual machines running one or more reducer processes that operate on each respective gradient produced by each worker process to produce an aggregated gradient. The one or more reducer processes may cause the aggregated gradient to be broadcast to each worker process.
