-
Publication Number: US11775823B2
Publication Date: 2023-10-03
Application Number: US17014139
Filing Date: 2020-09-08
Applicant: Google LLC
Inventor: Sashank Jakkam Reddi , Sanjiv Kumar , Manzil Zaheer , Satyen Chandrakant Kale
Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive effective learning rate while also ensuring that the effective learning rate is non-increasing.
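A non-increasing effective learning rate is the defining property of AMSGrad-style optimizers, in which the second-moment estimate is replaced by its running maximum. Below is a minimal NumPy sketch of that idea; the function name, hyperparameters, and quadratic toy objective are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def amsgrad_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One update with a non-increasing effective learning rate.

    Tracking the running max of the second-moment estimate (v_hat) means
    lr / (sqrt(v_hat) + eps) can only shrink or stay flat per coordinate.
    """
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad        # first moment (momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment
    v_hat = np.maximum(v_hat, v)              # enforce monotonicity
    param = param - lr * m / (np.sqrt(v_hat) + eps)
    return param, (m, v, v_hat)

# Usage: minimize f(x) = ||x||^2 (gradient 2x) from a random start.
x = np.random.randn(4)
state = (np.zeros_like(x), np.zeros_like(x), np.zeros_like(x))
for _ in range(2000):
    x, state = amsgrad_step(x, 2 * x, state, lr=0.05)
print(x)  # close to the optimum at 0
```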
-
Publication Number: US12093671B2
Publication Date: 2024-09-17
Application Number: US17731593
Filing Date: 2022-04-28
Applicant: Google LLC
Inventor: Rishabh Singh , Bin Ni , Manzil Zaheer
CPC classification number: G06F8/51
Abstract: Techniques are described herein for translating source code using sparse self-attention. In various implementations, a source code snippet in a first programming language may be processed to obtain graph(s) representing snippet tokens and relationships therebetween. Based on the graph(s), a subset of snippet token pairs may be identified from a superset of all possible token pairs in the source code snippet. Each token pair of the subset may include snippet tokens that are represented by nodes connected by one or more edges of the one or more graphs. A self-attention network of a translation machine learning model may be adapted to sparsely attend across the identified subset of token pairs. The source code snippet may then be processed based on the adapted translation machine learning model to generate a translation of the source code snippet in a second programming language.
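One way to realize the adaptation described above is to convert the graph edges into an attention mask, so that the softmax only distributes weight over the identified token pairs. The sketch below is a single-head NumPy illustration under that assumption; the edge list, shapes, and function names are hypothetical.

```python
import numpy as np

def sparse_attention(q, k, v, edges):
    """Scaled dot-product attention restricted to graph-connected token pairs.

    q, k, v: (n, d) arrays of query/key/value vectors, one row per token.
    edges:   iterable of (i, j) pairs from graphs over the code snippet
             (e.g., AST or data-flow edges); only these pairs attend.
    """
    n, d = q.shape
    mask = np.full((n, n), -np.inf)
    mask[np.arange(n), np.arange(n)] = 0.0  # let every token see itself
    for i, j in edges:
        mask[i, j] = 0.0                    # unmask connected pairs
    scores = q @ k.T / np.sqrt(d) + mask    # -inf zeroes disallowed pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy example: 4 tokens, edges from a small syntax graph.
rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(4, 8))
out = sparse_attention(q, k, v, edges=[(0, 1), (1, 2), (2, 3), (3, 0)])
print(out.shape)  # (4, 8)
```

Edges here are directed (i attends to j); adding the reverse pair for each edge makes the attention pattern symmetric.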
-
Publication Number: US20240176604A1
Publication Date: 2024-05-30
Application Number: US18070015
Filing Date: 2022-11-28
Applicant: Google LLC
Inventor: Joey Hong , Rishabh Singh , Joel Galenson , Jonathan Malmaud , Manzil Zaheer
IPC: G06F8/51
CPC classification number: G06F8/51
Abstract: Implementations are described herein for predicting symbolic transformation templates to automate source code transformations. In various implementations, pair(s) of predecessor and successor source code snippets may be processed using a symbolic transformation template prediction (STTP) model to predict a symbolic transformation template that includes a predecessor portion that matches the predecessor source code snippet(s) of the pair(s) and a successor portion that matches the successor source code snippet(s) of the pair(s). At least one additional predecessor source code snippet may be identified that matches the predecessor portion of the predicted symbolic transformation template. Placeholders of the predecessor portion of the predicted symbolic transformation template may be bound to one or more tokens of the at least one additional predecessor source code snippet to create binding(s). The successor portion of the predicted symbolic transformation template may be applied to the bindings to generate additional successor source code snippet(s).
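The bind-then-apply step lends itself to a compact illustration: a template is a token sequence in which `$`-prefixed placeholders stand for arbitrary tokens, the predecessor side is matched to create bindings, and the successor side is instantiated from them. The matcher below is a deliberately naive, equal-length sketch; real templates (and the STTP model that predicts them) are far more expressive.

```python
def bind(template_tokens, snippet_tokens):
    """Match a predecessor template against a snippet, binding $-placeholders.

    Returns a dict like {'$arg': 'user_count'}, or None if the lengths differ
    or a placeholder would need two different values.
    """
    if len(template_tokens) != len(snippet_tokens):
        return None
    bindings = {}
    for t, s in zip(template_tokens, snippet_tokens):
        if t.startswith("$"):                   # placeholder slot
            if bindings.setdefault(t, s) != s:  # must bind consistently
                return None
        elif t != s:                            # literal must match exactly
            return None
    return bindings

def apply_template(successor_tokens, bindings):
    """Instantiate the successor side of the template with the bindings."""
    return [bindings.get(t, t) for t in successor_tokens]

# Toy migration: old_api(x) -> new_api(x, timeout=30), for any argument x.
predecessor = ["old_api", "(", "$arg", ")"]
successor = ["new_api", "(", "$arg", ",", "timeout=30", ")"]
b = bind(predecessor, ["old_api", "(", "user_count", ")"])
print(apply_template(successor, b))
# ['new_api', '(', 'user_count', ',', 'timeout=30', ')']
```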
-
Publication Number: US20220156553A1
Publication Date: 2022-05-19
Application Number: US17589542
Filing Date: 2022-01-31
Applicant: Google LLC
Inventor: Joshua Timothy Ainslie , Santiago Ontañón , Philip Pham , Manzil Zaheer , Guru Guruganesh , Kumar Avinava Dubey , Amr Ahmed
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
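A common concrete instance of this two-regime pattern (though not necessarily the patent's exact construction) is to let a small proper subset of "global" positions attend everywhere while all remaining positions attend only within a local window. The sketch below builds such a boolean attention mask; the window size and choice of global positions are illustrative.

```python
import numpy as np

def build_sparse_mask(n, global_positions, window=1):
    """Boolean (n, n) mask: True where query i may attend to key j.

    Positions in `global_positions` (a proper subset of 0..n-1) attend to
    and are attended by every position; all other positions attend only
    within a local window around themselves.
    """
    allowed = np.zeros((n, n), dtype=bool)
    idx = np.arange(n)
    for i in range(n):
        allowed[i, np.abs(idx - i) <= window] = True  # local band
    g = np.asarray(list(global_positions))
    allowed[g, :] = True                              # global rows
    allowed[:, g] = True                              # global columns
    return allowed

# 8 positions, position 0 global, window of 1 for everyone else.
mask = build_sparse_mask(8, global_positions=[0], window=1)
print(mask.astype(int))
```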
-
Publication Number: US20210383191A1
Publication Date: 2021-12-09
Application Number: US17341193
Filing Date: 2021-06-07
Applicant: Google LLC
Inventor: Joshua Timothy Ainslie , Santiago Ontañón , Philip Pham , Manzil Zaheer , Guru Guruganesh , Kumar Avinava Dubey , Amr Ahmed
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
-
Publication Number: US20210319339A1
Publication Date: 2021-10-14
Application Number: US17227817
Filing Date: 2021-04-12
Applicant: Google LLC
Inventor: Ankit Singh Rawat , Manzil Zaheer , Aditya Krishna Menon , Sanjiv Kumar , Melanie Weber
Abstract: Generally, the present disclosure provides systems and methods for performing machine learning in hyperbolic space. Specifically, techniques are provided that enable the learning of a classifier (e.g., a large-margin classifier) for data defined within a hyperbolic space, which may be particularly beneficial for data that possesses a hierarchical structure.
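The key geometric ingredient for classification in hyperbolic space is the geodesic distance, which has a closed form in the Poincaré ball model. The sketch below computes that distance and uses it for nearest-prototype classification; this is a stand-in illustration, not the large-margin classifier of the disclosure, and the prototype coordinates are made up.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball:
    d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))."""
    uu = np.sum(u * u)
    vv = np.sum(v * v)
    duv = np.sum((u - v) ** 2)
    return np.arccosh(1.0 + 2.0 * duv / ((1.0 - uu) * (1.0 - vv) + eps))

# Nearest-prototype classification under hyperbolic distance. In hierarchical
# embeddings, points near the boundary sit deep in the hierarchy, points near
# the origin are generic, which is why this geometry suits tree-like data.
prototypes = {"animal": np.array([0.1, 0.0]), "dog": np.array([0.8, 0.1])}
query = np.array([0.7, 0.15])
label = min(prototypes, key=lambda c: poincare_distance(query, prototypes[c]))
print(label)  # 'dog'
```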
-
Publication Number: US12271810B2
Publication Date: 2025-04-08
Application Number: US17100253
Filing Date: 2020-11-20
Applicant: Google LLC
Inventor: Sashank Jakkam Reddi , Sanjiv Kumar , Manzil Zaheer , Zachary Burr Charles , Zachary Alan Garrett , John Keith Rush , Jakub Konecny , Hugh Brendan McMahan
Abstract: A computing system and method can be used to implement a version of federated learning (FL) that incorporates adaptivity (e.g., leverages an adaptive learning rate). In particular, the present disclosure provides a general optimization framework in which (1) clients perform multiple epochs of training using a client optimizer to minimize loss on their local data and (2) a server system updates its global model by applying a gradient-based server optimizer to the average of the clients' model updates. This framework can seamlessly incorporate adaptivity by using adaptive optimizers as client and/or server optimizers. Building upon this general framework, the present disclosure also provides example specific adaptive optimization techniques for FL which use per-coordinate methods as server optimizers. By focusing on adaptive server optimization, the use of adaptive learning rates is enabled without increasing client storage or communication costs, and compatibility with cross-device FL can be ensured.
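The framework's two levels can be shown in a few lines: each client runs a few epochs of local SGD and reports its model delta, and the server treats the negated average delta as a pseudo-gradient for a per-coordinate Adam step (as in FedAdam-style methods). The least-squares task, hyperparameters, and function names below are illustrative assumptions, not the patent's specification.

```python
import numpy as np

def client_delta(w0, data, lr=0.05, epochs=2):
    """Client optimizer: a few epochs of local SGD on least squares.
    Returns the model delta (local model minus the received global model)."""
    w = w0.copy()
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2.0 * (w @ x - y) * x  # grad of (w.x - y)^2 w.r.t. w
    return w - w0

def server_adam_round(w, deltas, state, lr=0.05, b1=0.9, b2=0.99, eps=1e-3):
    """Server optimizer: per-coordinate Adam applied to -mean(delta)."""
    m, v = state
    g = -np.mean(deltas, axis=0)        # pseudo-gradient from client updates
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    return w - lr * m / (np.sqrt(v) + eps), (m, v)

# Two clients with different local data drawn around true weights [1, -2].
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(2):
    X = rng.normal(size=(20, 2))
    clients.append(list(zip(X, X @ true_w)))

w = np.zeros(2)
state = (np.zeros(2), np.zeros(2))
for _ in range(100):
    deltas = [client_delta(w, c) for c in clients]
    w, state = server_adam_round(w, deltas, state)
print(w)  # approaches the true weights [1, -2]
```

Note that adaptivity lives entirely on the server here: clients run plain SGD and send only a model delta, which is why client storage and communication costs do not grow.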
-
Publication Number: US11960867B1
Publication Date: 2024-04-16
Application Number: US18198674
Filing Date: 2023-05-17
Applicant: Google LLC
Inventor: Rishabh Singh , Hanjun Dai , Manzil Zaheer , Artem Goncharuk , Karen Davis , David Andre
CPC classification number: G06F8/436 , G06F40/279 , G06F40/40 , G06N3/08 , G06N7/01
Abstract: Using a natural language (NL) latent representation in the automated conversion of source code from a base programming language (e.g., C++) to a target programming language (e.g., Python). A base-to-NL model can be used to generate an NL latent representation by processing a base source code snippet in the base programming language. Further, an NL-to-target model can be used to generate a target source code snippet in the target programming language (that is functionally equivalent to the base source code snippet), by processing the NL latent representation. In some implementations, output(s) from the NL-to-target model indicate canonical representation(s) of variables, and in generating the target source code snippet, technique(s) are used to match those canonical representation(s) to variable(s) of the base source code snippet. In some implementations, multiple candidate target source code snippets are generated, and a subset (e.g., one) is selected based on evaluation(s).
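The pipeline has three separable pieces: a base-to-NL model, an NL-to-target model, and the step that maps canonical variable names in the models' outputs back to the base snippet's variables. The sketch below stubs out the two learned models with hard-coded outputs (a real system would use trained seq2seq models) so that the data flow and a simple rebinding heuristic are visible; every name in it is hypothetical.

```python
import re

CPP_KEYWORDS = {"int", "for", "auto", "return"}

def base_to_nl(base_snippet):
    # Stub for the learned base-to-NL model: emits an NL latent
    # representation over canonical variable names VAR_0, VAR_1, ...
    return "set VAR_0 to the sum of the elements VAR_1 of VAR_2"

def nl_to_target(nl_latent):
    # Stub for the learned NL-to-target model: emits target-language
    # code over the same canonical names.
    return "VAR_0 = sum(VAR_2)"

def bind_canonical_variables(base_snippet, target_snippet):
    """Heuristically map VAR_i back to base-snippet identifiers, pairing
    canonical names with identifiers in order of first appearance."""
    seen = []
    for tok in re.findall(r"[A-Za-z_]\w*", base_snippet):
        if tok not in CPP_KEYWORDS and tok not in seen:
            seen.append(tok)
    for i, name in enumerate(seen):
        target_snippet = target_snippet.replace(f"VAR_{i}", name)
    return target_snippet

base = "int total = 0; for (auto x : values) total += x;"
target = bind_canonical_variables(base, nl_to_target(base_to_nl(base)))
print(target)  # total = sum(values)
```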
-
Publication Number: US20230394310A1
Publication Date: 2023-12-07
Application Number: US18453837
Filing Date: 2023-08-22
Applicant: Google LLC
Inventor: Sashank Jakkam Reddi , Sanjiv Kumar , Manzil Zaheer , Satyen Chandrakant Kale
Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive effective learning rate while also ensuring that the effective learning rate is non-increasing.
-
Publication Number: US20230350657A1
Publication Date: 2023-11-02
Application Number: US17731593
Filing Date: 2022-04-28
Applicant: Google LLC
Inventor: Rishabh Singh , Bin Ni , Manzil Zaheer
IPC: G06F8/51
CPC classification number: G06F8/51
Abstract: Techniques are described herein for translating source code using sparse self-attention. In various implementations, a source code snippet in a first programming language may be processed to obtain graph(s) representing snippet tokens and relationships therebetween. Based on the graph(s), a subset of snippet token pairs may be identified from a superset of all possible token pairs in the source code snippet. Each token pair of the subset may include snippet tokens that are represented by nodes connected by one or more edges of the one or more graphs. A self-attention network of a translation machine learning model may be adapted to sparsely attend across the identified subset of token pairs. The source code snippet may then be processed based on the adapted translation machine learning model to generate a translation of the source code snippet in a second programming language.
-