-
Publication No.: US10733016B1
Publication date: 2020-08-04
Application No.: US16395697
Application date: 2019-04-26
Applicant: Google LLC
Abstract: Methods, systems, and apparatus for scheduling first-in-first-out instructions are described. In one aspect, a method includes receiving data representing code of a program to be executed by a processing unit comprising hardware processors. For each of one or more of the hardware processors, an order of independent groups of first-in-first-out (FIFO) instructions for execution by the hardware processor is identified in the data representing the code of the program. For each independent group of FIFO instructions for execution by the hardware processor, a path length metric that represents how long it will take to reach an end of the program from the independent group of FIFO instructions is determined. A new order of the independent groups of FIFO instructions for execution by the hardware processor is generated based at least on the path length metric for each independent group of FIFO instructions for execution by the hardware processor.
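Illustrative sketch (not taken from the patent): one plausible reading of the path length metric is the longest remaining path through a dependency graph, with groups that have the longest remaining path scheduled first. The function name `reorder_fifo_groups`, the `successors` and `cost` inputs, and the group names are all hypothetical.

```python
from functools import lru_cache


def reorder_fifo_groups(groups, successors, cost):
    """Reorder FIFO instruction groups by a longest-path-to-end metric.

    groups:     group names in their original program order
    successors: dict mapping a node to the nodes that depend on it
    cost:       dict mapping every node to an estimated duration
    All three inputs are assumptions made for this sketch.
    """

    @lru_cache(maxsize=None)
    def path_length(node):
        # Duration of this node plus the longest path through its dependents.
        tails = (path_length(s) for s in successors.get(node, ()))
        return cost[node] + max(tails, default=0)

    # sorted() is stable, so groups with equal metrics keep their original order.
    return sorted(groups, key=lambda g: -path_length(g))


groups = ["load_a", "load_b", "load_c"]
successors = {"load_a": ["add"], "load_b": ["matmul"], "matmul": ["add"]}
cost = {"load_a": 1, "load_b": 1, "load_c": 1, "add": 2, "matmul": 10}
new_order = reorder_fifo_groups(groups, successors, cost)
```

Here `load_b` sits on the longest chain (through `matmul` and `add`), so it moves to the front of the new order.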
-
Publication No.: US20240303464A1
Publication date: 2024-09-12
Application No.: US18598876
Application date: 2024-03-07
Applicant: Google LLC
Inventors: Nan Du, Tao Wang, Yanqi Zhou, Tao Lei, Yuanzhong Xu, Andrew Mingbo Dai, Zhifeng Chen, Dewen Zeng, Yingwei Cui
Abstract: A method includes providing a first set of data objects to a first skip router of a neural network (NN). The NN includes a first NN layer and a second NN layer. The first set of data objects is subdivided into a first set of skip objects and a first set of non-skip objects based on a first skip logic implemented by the first skip router and a first context of each data object in the first set of data objects. A first set of processed objects is generated based on the first set of non-skip objects and a first layer logic implemented by the first NN layer. Predictions are generated based on a second set of data objects and a second layer logic implemented by the second NN layer. The second set of data objects includes the first set of processed objects and the first set of skip objects.
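Illustrative sketch (not taken from the patent): the data flow in the abstract can be pictured as a router that lets some objects bypass the first layer while every object still reaches the second layer. The names `forward_with_skip`, `should_skip`, `layer1`, and `layer2` are hypothetical, and the per-object context is simplified to the object's own value.

```python
def forward_with_skip(objects, should_skip, layer1, layer2):
    """Route each object past or through layer1, then apply layer2 to all.

    should_skip stands in for the skip logic; layer1 and layer2 stand in
    for the two layers. All are plain callables assumed for this sketch.
    """
    second_inputs = []
    for obj in objects:
        if should_skip(obj):
            # Skip object: forwarded unchanged to the second layer.
            second_inputs.append(obj)
        else:
            # Non-skip object: processed by the first layer.
            second_inputs.append(layer1(obj))
    return [layer2(x) for x in second_inputs]


predictions = forward_with_skip(
    [1, -2, 3],
    should_skip=lambda x: x < 0,  # toy skip rule
    layer1=lambda x: x * 10,      # toy first layer
    layer2=lambda x: x + 1,       # toy second layer
)
```

In this toy run the negative value bypasses the first layer, so the second layer sees `[10, -2, 30]`.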
-
Publication No.: US11972238B2
Publication date: 2024-04-30
Application No.: US17838710
Application date: 2022-06-13
Applicant: Google LLC
Inventors: Yuanzhong Xu
IPC classification: G06F8/41, G06F16/901, G06N20/00
CPC classification: G06F8/443, G06F16/9024, G06N20/00
Abstract: Methods, systems, and apparatus for propagating reduced-precision on computation graphs are described. In one aspect, a method includes receiving data specifying a directed graph that includes operators for a program. The operators include first operators that each represent a numerical operation performed on numerical values having a first level of precision and second operators that each represent a numerical operation performed on numerical values having a second level of precision. One or more downstream operators are identified for a first operator. A determination is made whether each downstream operator represents a numerical operation that is performed on input values having the second level of precision. Whenever each downstream operator represents a numerical operation that is performed on input values having the second level of precision, a precision of numerical values output by the operation represented by the first operator is adjusted to the second level of precision.
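Illustrative sketch (not taken from the patent): the check described in the abstract can be expressed as a single pass over a graph, lowering an operator's output precision only when every downstream operator consumes the lower precision. The function name, the dictionary layout, and the precision labels `"f32"` and `"bf16"` are hypothetical.

```python
def propagate_reduced_precision(output_precision, consumers, input_precision, low="bf16"):
    """Return a copy of output_precision with eligible operators lowered.

    output_precision: operator -> precision of the values it produces
    consumers:        operator -> list of its downstream operators
    input_precision:  operator -> precision of the values it consumes
    The data layout and precision labels are assumptions for this sketch.
    """
    result = dict(output_precision)
    for op, precision in output_precision.items():
        downstream = consumers.get(op, [])
        # Lower only if there is at least one consumer and all of them
        # operate on low-precision inputs.
        if precision != low and downstream and all(
            input_precision[d] == low for d in downstream
        ):
            result[op] = low
    return result


adjusted = propagate_reduced_precision(
    output_precision={"add": "f32", "mul": "f32"},
    consumers={"add": ["matmul1", "matmul2"], "mul": ["matmul1", "reduce"]},
    input_precision={"matmul1": "bf16", "matmul2": "bf16", "reduce": "f32"},
)
```

In this toy graph `add` is lowered because both of its consumers take `bf16` inputs, while `mul` keeps `f32` because `reduce` still needs full precision.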
-
Publication No.: US11221879B2
Publication date: 2022-01-11
Application No.: US16919968
Application date: 2020-07-02
Applicant: Google LLC
Abstract: Methods, systems, and apparatus for scheduling first-in-first-out instructions are described. In one aspect, a method includes receiving data representing code of a program to be executed by a processing unit comprising hardware processors. For each of one or more of the hardware processors, an order of independent groups of first-in-first-out (FIFO) instructions for execution by the hardware processor is identified in the data representing the code of the program. For each independent group of FIFO instructions for execution by the hardware processor, a path length metric that represents how long it will take to reach an end of the program from the independent group of FIFO instructions is determined. A new order of the independent groups of FIFO instructions for execution by the hardware processor is generated based at least on the path length metric for each independent group of FIFO instructions for execution by the hardware processor.
-
Publication No.: US20200249924A1
Publication date: 2020-08-06
Application No.: US16263730
Application date: 2019-01-31
Applicant: Google LLC
Inventors: Yuanzhong Xu
IPC classification: G06F8/41, G06N20/00, G06F16/901
Abstract: Methods, systems, and apparatus for propagating reduced-precision on computation graphs are described. In one aspect, a method includes receiving data specifying a directed graph that includes operators for a program. The operators include first operators that each represent a numerical operation performed on numerical values having a first level of precision and second operators that each represent a numerical operation performed on numerical values having a second level of precision. One or more downstream operators are identified for a first operator. A determination is made whether each downstream operator represents a numerical operation that is performed on input values having the second level of precision. Whenever each downstream operator represents a numerical operation that is performed on input values having the second level of precision, a precision of numerical values output by the operation represented by the first operator is adjusted to the second level of precision.
-
Publication No.: US20240112088A1
Publication date: 2024-04-04
Application No.: US18520083
Application date: 2023-11-27
Applicant: Google LLC
Inventors: Jiahui Yu, Xin Li, Han Zhang, Vijay Vasudevan, Alexander Yeong-Shiuh Ku, Jason Michael Baldridge, Yuanzhong Xu, Jing Yu Koh, Thang Minh Luong, Gunjan Baid, Zirui Wang, Yonghui Wu
IPC classification: G06N20/00
CPC classification: G06N20/00
Abstract: Systems and methods are provided for vector-quantized image modeling using vision transformers and improved codebook handling. In particular, the present disclosure provides a Vector-quantized Image Modeling (VIM) approach that involves pretraining a machine learning model (e.g., Transformer model) to predict rasterized image tokens autoregressively. The discrete image tokens can be encoded from a learned Vision-Transformer-based VQGAN (example implementations of which can be referred to as ViT-VQGAN). The present disclosure proposes multiple improvements over vanilla VQGAN from architecture to codebook learning, yielding better efficiency and reconstruction fidelity. The improved ViT-VQGAN further improves vector-quantized image modeling tasks, including unconditional image generation, conditioned image generation (e.g., class-conditioned image generation), and unsupervised representation learning.
-
Publication No.: US20230222318A1
Publication date: 2023-07-13
Application No.: US18009841
Application date: 2021-06-30
Applicant: Google LLC
Inventors: Dmitry Lepikhin, Yanping Huang, Orhan Firat, Maxim Krikun, Dehao Chen, Noam M. Shazeer, HyoukJoong Lee, Yuanzhong Xu, Zhifeng Chen
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input to generate a network output. In one aspect, one of the systems includes an attention neural network configured to perform the machine learning task, the attention neural network including one or more attention layers, each attention layer comprising an attention sub-layer and a feed-forward sub-layer. Some or all of the attention layers have a feed-forward sub-layer that applies conditional computation to the inputs to the sub-layer.
-
Publication No.: US11385875B2
Publication date: 2022-07-12
Application No.: US16263730
Application date: 2019-01-31
Applicant: Google LLC
Inventors: Yuanzhong Xu
IPC classification: G06F8/41, G06F16/901, G06N20/00
Abstract: Methods, systems, and apparatus for propagating reduced-precision on computation graphs are described. In one aspect, a method includes receiving data specifying a directed graph that includes operators for a program. The operators include first operators that each represent a numerical operation performed on numerical values having a first level of precision and second operators that each represent a numerical operation performed on numerical values having a second level of precision. One or more downstream operators are identified for a first operator. A determination is made whether each downstream operator represents a numerical operation that is performed on input values having the second level of precision. Whenever each downstream operator represents a numerical operation that is performed on input values having the second level of precision, a precision of numerical values output by the operation represented by the first operator is adjusted to the second level of precision.
-
Publication No.: US20200341807A1
Publication date: 2020-10-29
Application No.: US16919968
Application date: 2020-07-02
Applicant: Google LLC
Abstract: Methods, systems, and apparatus for scheduling first-in-first-out instructions are described. In one aspect, a method includes receiving data representing code of a program to be executed by a processing unit comprising hardware processors. For each of one or more of the hardware processors, an order of independent groups of first-in-first-out (FIFO) instructions for execution by the hardware processor is identified in the data representing the code of the program. For each independent group of FIFO instructions for execution by the hardware processor, a path length metric that represents how long it will take to reach an end of the program from the independent group of FIFO instructions is determined. A new order of the independent groups of FIFO instructions for execution by the hardware processor is generated based at least on the path length metric for each independent group of FIFO instructions for execution by the hardware processor.
-
Publication No.: US20240311402A1
Publication date: 2024-09-19
Application No.: US18136634
Application date: 2023-04-19
Applicant: GOOGLE LLC
Inventors: Martin Baeuml, Yanping Huang, Wenhao Jia, Chang Lan, Yuanzhong Xu, Junwhan Ahn, Alexander Bailey, Leif Schelin, Trevor Strohman, Emanuel Taropa, Sidharth Mudgal, Yanyan Zheng, Zhifeng Chen, Ahmad Beirami
IPC classification: G06F16/332, G06F40/40
CPC classification: G06F16/3322, G06F16/3329, G06F40/40
Abstract: Implementations relate to reducing latency in generating and/or rendering natural language (NL) output generated using a large language model (LLM). Processor(s) of a system can: receive NL based input associated with a client device, and generate the NL based output utilizing the LLM. The NL based output can be a stream of NL based output in that it includes a plurality of segments, and is generated on a segment-by-segment basis. In some implementations, a first segment of the stream of NL based output is selected for inclusion in the stream of NL based output as a second segment (and any subsequent segment) is being generated to reduce latency in evaluating the NL based output as a whole prior to rendering thereof. In some versions of those implementations, the first segment is rendered as the second segment (and any subsequent segment) is being generated to further reduce latency in rendering thereof.