-
Publication Number: US12001509B2
Publication Date: 2024-06-04
Application Number: US16821509
Filing Date: 2020-03-17
Applicant: Google LLC
Inventor: Seungyeon Kim , Jingzhao Zhang , Andreas Veit , Sanjiv Kumar , Sashank Reddi , Praneeth Karimireddy
CPC classification number: G06F17/18 , G06F18/217 , G06N20/00 , G06N3/084
Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive per-coordinate clipping threshold to clip a current first moment of the coordinate to obtain a current update value, which enables faster convergence for the machine-learned model when the noise in the stochastic gradients is heavy-tailed.
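To make the clipping step concrete, the following is a minimal NumPy sketch of an optimizer in this style, assuming a toy gradient oracle; the moving-average estimators, the threshold rule, and every hyperparameter name are illustrative assumptions rather than the patented implementation.

    import numpy as np

    def clipped_adaptive_step(params, grad, state, lr=1e-3,
                              beta1=0.9, beta2=0.99, eps=1e-8):
        # Illustrative sketch, not the patent's exact method: keep an EMA of
        # the gradient (the first moment) and an EMA of |gradient| as a
        # per-coordinate clipping threshold, then clip each coordinate of
        # the first moment to that threshold to form the update.
        m, tau = state
        m = beta1 * m + (1 - beta1) * grad               # current first moment
        tau = beta2 * tau + (1 - beta2) * np.abs(grad)   # adaptive threshold
        update = m * np.minimum(1.0, tau / (np.abs(m) + eps))
        return params - lr * update, (m, tau)

    # Usage on a toy quadratic 0.5 * ||x||^2, whose gradient at x is x itself.
    x = np.array([5.0, -3.0, 2.0])
    state = (np.zeros_like(x), np.zeros_like(x))
    for _ in range(1000):
        x, state = clipped_adaptive_step(x, x, state, lr=0.05)
    print(x)  # approaches the minimizer at the origin

Because the threshold tracks the typical per-coordinate gradient magnitude, a rare heavy-tailed gradient spike moves the first moment but is clipped before it reaches the parameters.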
-
Publication Number: US20230112862A1
Publication Date: 2023-04-13
Application Number: US17960380
Filing Date: 2022-10-05
Applicant: Google LLC
Inventor: Venkata S. Bhojanapalli , Andreas Veit , Ayan Chakrabarti , Frederick Liu , Himanshu Jain , Michal Lukasik , Sanjiv Kumar , Yin-Wen Chang
IPC: G06N3/04
Abstract: Provided are systems and methods that improve the computational efficiency of Transformers or other attention-based neural networks or machine learning models by reusing a number of attention scores between layers and/or heads of the model. To reduce the computational cost of self-attention-based models while achieving comparable or even superior results, example aspects of the present disclosure propose a novel architecture that reuses attention scores computed in one layer in one or multiple subsequent layers.
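As a concrete picture of score reuse, here is a minimal single-head NumPy sketch in which a later layer skips its own query/key projections and softmax, applying the attention scores computed by an earlier layer to its freshly projected values; the shapes, weight names, and single-head simplification are assumptions for illustration, not the patented architecture.

    import numpy as np

    def softmax(x, axis=-1):
        z = x - x.max(axis=axis, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=axis, keepdims=True)

    def attention_scores(x, w_q, w_k):
        # standard scaled dot-product attention scores, computed once
        q, k = x @ w_q, x @ w_k
        return softmax(q @ k.T / np.sqrt(q.shape[-1]))

    def attention_with_reused_scores(x, w_v, scores):
        # only the value projection is fresh; scores come from an earlier layer
        return scores @ (x @ w_v)

    rng = np.random.default_rng(0)
    seq_len, dim = 8, 16
    x = rng.normal(size=(seq_len, dim))
    w_q, w_k, w_v1, w_v2 = (0.1 * rng.normal(size=(dim, dim)) for _ in range(4))

    scores = attention_scores(x, w_q, w_k)               # layer 1 computes scores
    h1 = attention_with_reused_scores(x, w_v1, scores)   # layer 1 output
    h2 = attention_with_reused_scores(h1, w_v2, scores)  # layer 2 reuses them

Reusing the scores removes the quadratic query-key matmul and the softmax from the later layer, which is where the claimed efficiency gain comes from.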
-
Publication Number: US20230111978A1
Publication Date: 2023-04-13
Application Number: US17910756
Filing Date: 2020-03-18
Applicant: Google LLC
Inventor: Andreas Veit , Kimberly Wilber
IPC: G06N3/08 , G06F16/903
Abstract: Techniques are disclosed that enable learning an embedding space using cross-examples, where the distance between a query and an electronic resource in the embedding space provides an indication of the relevance of the electronic resource to the query. Various implementations include learning the embedding space using cross-example Softmax techniques. Various implementations include learning the embedding space using cross-example negative mining. Additional or alternative techniques are disclosed that enable determining an electronic resource for a query by comparing a query vector (e.g., an embedding space representation of the query) with a set of pre-stored candidate electronic resource vectors (e.g., embedding space representations of a set of candidate electronic resources).
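As one way to picture cross-example negatives, the sketch below implements a cross-example Softmax-style loss in NumPy, in which each positive query-document pair is normalized against negative pairs drawn from the entire batch (every off-diagonal entry of the similarity matrix) rather than only from its own row; the exact formulation and any temperature or margin terms in the patent may differ.

    import numpy as np

    def cross_example_softmax_loss(query_vecs, doc_vecs):
        # similarity of every query to every document in the batch
        s = query_vecs @ doc_vecs.T                  # shape (B, B)
        pos = np.diag(s)                             # matching pairs on the diagonal
        neg = s[~np.eye(s.shape[0], dtype=bool)]     # all cross-example negatives
        losses = []
        for p in pos:
            # each positive competes against every negative pair in the batch
            z = np.concatenate(([p], neg))
            m = z.max()                              # stable log-sum-exp
            losses.append(m + np.log(np.exp(z - m).sum()) - p)
        return float(np.mean(losses))

    rng = np.random.default_rng(0)
    q = rng.normal(size=(4, 8)); q /= np.linalg.norm(q, axis=1, keepdims=True)
    d = rng.normal(size=(4, 8)); d /= np.linalg.norm(d, axis=1, keepdims=True)
    print(cross_example_softmax_loss(q, d))

Because the normalizer is shared across the whole batch rather than computed per row, pair scores become comparable across different queries, which is the kind of global consistency cross-example normalization aims at.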
-
Publication Number: US20210295201A1
Publication Date: 2021-09-23
Application Number: US16821509
Filing Date: 2020-03-17
Applicant: Google LLC
Inventor: Seungyeon Kim , Jingzhao Zhang , Andreas Veit , Sanjiv Kumar , Sashank Reddi , Praneeth Karimireddy
Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive per-coordinate clipping threshold to clip a current first moment of the coordinate to obtain a current update value, which enables faster convergence for the machine-learned model when the noise in the stochastic gradients is heavy-tailed.
-
Publication Number: US20230017505A1
Publication Date: 2023-01-19
Application Number: US17375960
Filing Date: 2021-07-14
Applicant: Google LLC
Inventor: Aditya Krishna Menon , Sanjiv Kumar , Himanshu Jain , Andreas Veit , Ankit Singh Rawat , Gayan Sadeep Jayasumana Hirimbura Matara Kankanamge
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for accounting for long-tail training data.
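The abstract is terse, so the following is a hedged illustration only: one common way to account for long-tail training data is to adjust class logits by the log of the class priors, which counteracts the bias toward frequent classes. The short NumPy sketch below shows that idea; the function names, the scaling parameter tau, and the technique itself are assumptions here, not the patent's stated method.

    import numpy as np

    def prior_adjusted_logits(logits, class_counts, tau=1.0):
        # subtract tau * log(prior) so rare classes are not drowned out by
        # frequent ones at training or prediction time (illustrative only)
        priors = class_counts / class_counts.sum()
        return logits - tau * np.log(priors)

    # toy example: a head-heavy label distribution over 4 classes
    counts = np.array([900.0, 80.0, 15.0, 5.0])
    logits = np.array([2.0, 1.9, 1.8, 1.7])
    print(prior_adjusted_logits(logits, counts).argmax())  # favors a rarer class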