Patent search ap:("Google LLC") AND inv:"Joshua Timothy Ainslie" Page 1

1.

Patent application
RELATIVE POSITION BIASES IN ATTENTION NEURAL NETWORKS USING FUNCTIONAL INTERPOLATION (legal status: in force)

Publication No.: US20250111210A1

Publication date: 2025-04-03

Application No.: US18900531

Filing date: 2024-09-27

Applicant: Google LLC

Inventor: Chong You, Guru Guruganesh, Joshua Timothy Ainslie, Manzil Zaheer, Sanjiv Kumar, Santiago Ontañón, Shanda Li, Venkata Sesha Pavana Srinadh Bhojanapalli, Sumit Sanghai

IPC: G06N3/0475

Abstract: Systems and methods for processing inputs using attention neural networks. In particular, one or more of the attention layers within the attention neural network compute relative position biases using functional interpolation.

2.

Patent application
Sparse Mixer Architecture (legal status: in force)

Publication No.: US20240386256A1

Publication date: 2024-11-21

Application No.: US18318049

Filing date: 2023-05-16

Applicant: Google LLC

Inventor: James Lee Thorp, Joshua Timothy Ainslie

IPC: G06N3/0499

Abstract: Improved multi-layer machine learning model architectures are provided that exhibit increased accuracy, decreased training time, decreased inference compute cost, and/or increased stability while training. These improved models include a plurality of sequential layers, each layer comprising a mixing layer that feeds into a feedforward layer. These improved models achieve these benefits by ‘enhancing’ a subset of the feedforward layers with mixture-of-experts or other sparse multi-network architectures while ‘degrading’ a subset of the mixing layers to be simple linear mixing layers (e.g., that multiply inputs by one or more mixing matrices) rather than more complicated attentional mixing mechanisms (e.g., including a number of matrix multiplications, dot products, and nonlinear operations). Such a combination of mixing layer modifications and feedforward layer modifications in a single multi-layer model exhibits synergistic improvements with respect to training time, inference computational cost, and training stability for a given level of model accuracy.
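As a rough illustration of the layer pairing this abstract describes, the sketch below combines a linear mixing layer (a single multiplication by a mixing matrix) with a mixture-of-experts feedforward layer. It is a minimal sketch, not the patented implementation: the top-1 routing rule, the function names (`linear_mixing`, `moe_feedforward`, `sparse_mixer_layer`), and the nested-list representation are all assumptions made for this example.

```python
from typing import Callable, List

Matrix = List[List[float]]
Expert = Callable[[List[float]], List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def linear_mixing(tokens: Matrix, mixing: Matrix) -> Matrix:
    """Mix information across tokens with one (seq x seq) matrix multiply."""
    return matmul(mixing, tokens)


def moe_feedforward(tokens: Matrix, router: Matrix, experts: List[Expert]) -> Matrix:
    """Send each token to the expert with the highest router score.

    Top-1 routing is an assumption of this sketch, not taken from the abstract.
    """
    scores = matmul(tokens, router)  # shape: (seq x num_experts)
    out = []
    for token, row in zip(tokens, scores):
        best = max(range(len(row)), key=row.__getitem__)
        out.append(experts[best](token))
    return out


def sparse_mixer_layer(tokens: Matrix, mixing: Matrix, router: Matrix,
                       experts: List[Expert]) -> Matrix:
    """One layer: a mixing layer that feeds into a feedforward layer."""
    return moe_feedforward(linear_mixing(tokens, mixing), router, experts)


# Hypothetical toy inputs: two tokens of width 2, an averaging mixing matrix,
# and two simple experts.
tokens = [[1.0, 0.0], [0.0, 2.0]]
mixing = [[0.5, 0.5], [0.5, 0.5]]
router = [[1.0, 0.0], [0.0, 1.0]]
experts = [lambda t: [2 * v for v in t], lambda t: [v + 1 for v in t]]
result = sparse_mixer_layer(tokens, mixing, router, experts)
```

Only one expert runs per token, which is where the sparsity in the feedforward half comes from; the mixing half uses no dot-product attention at all.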

3.

Granted patent
Attention neural networks with sparse attention mechanisms (legal status: in force)

Publication No.: US11238332B2

Publication date: 2022-02-01

Application No.: US17341193

Filing date: 2021-06-07

Applicant: Google LLC

Inventor: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed

IPC: G06N3/04, G06N3/08, G06N3/063, G06N20/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
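One way to picture "attending differently" for a proper subset of positions is an attention mask in which the subset positions attend everywhere while all other positions attend only locally and to the subset. The sketch below builds such a mask. The specific pattern (full attention for the subset, a sliding window plus the subset for the rest), the function name `sparse_attention_mask`, and the `window` parameter are assumptions for illustration; the abstract does not fix any of them.

```python
from typing import List, Set


def sparse_attention_mask(seq_len: int, subset: Set[int], window: int) -> List[List[bool]]:
    """Return mask[q][k] == True where query position q may attend to key position k.

    Assumed pattern (not specified by the abstract):
      - positions in `subset` attend to every position;
      - all other positions attend to neighbours within `window` and to `subset`.
    """
    if not subset < set(range(seq_len)):
        raise ValueError("subset must be a proper subset of the input positions")
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            if q in subset:
                allowed = True
            else:
                allowed = abs(q - k) <= window or k in subset
            row.append(allowed)
        mask.append(row)
    return mask


# Hypothetical example: 6 positions, position 0 is the special subset, window of 1.
mask = sparse_attention_mask(6, {0}, 1)
```

With a fixed window and a small subset, the number of allowed query-key pairs grows roughly linearly with sequence length rather than quadratically.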

4.

Patent application
MIXING TOKENS WITH SPECTRAL TRANSFORM (legal status: in force)

Publication No.: US20230077928A1

Publication date: 2023-03-16

Application No.: US17474928

Filing date: 2021-09-14

Applicant: Google LLC

Inventor: James Patrick Lee-Thorp, Joshua Timothy Ainslie, Ilya Eckstein, Santiago Ontañón

IPC: G06N20/00, G06F17/14

Abstract: Transformer systems and methods of using such transformer systems including computer programs encoded on a computer storage medium, for performing a deep learning task on an input sequence to generate an encoded output. In one aspect, one of the transformer systems includes an encoder architecture block, comprising: a spectral transform mixing layer that receives input embeddings of input tokens and generates, as output, a spectral transform output along a sequence dimension of the input embeddings; and a feed forward layer that receives an input based on the input embeddings of input tokens and the spectral transform output and generates an output for a subsequent processing block.
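The encoder block in this abstract can be sketched as a transform along the sequence dimension followed by a feed forward step. The example below is a minimal sketch under stated assumptions: it uses a discrete Fourier transform as the spectral transform, keeps only the real part, and combines the transform output with the input embeddings by addition. None of these choices is confirmed by the abstract, and the names `dft_along_sequence` and `encoder_block` are hypothetical.

```python
import cmath
from typing import Callable, List

Matrix = List[List[float]]


def dft_along_sequence(embeddings: Matrix) -> Matrix:
    """Discrete Fourier transform over the sequence axis, keeping the real part.

    Using a DFT and discarding the imaginary part are assumptions of this sketch.
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    out = []
    for k in range(n):
        row = []
        for d in range(dim):
            total = sum(
                embeddings[t][d] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)
            )
            row.append(total.real)
        out.append(row)
    return out


def encoder_block(embeddings: Matrix,
                  feed_forward: Callable[[List[float]], List[float]]) -> Matrix:
    """Spectral mixing, then a per-token feed forward layer.

    The feed forward input is the sum of the embeddings and the spectral output
    (a residual-style combination, assumed here).
    """
    mixed = dft_along_sequence(embeddings)
    combined = [[x + m for x, m in zip(x_row, m_row)]
                for x_row, m_row in zip(embeddings, mixed)]
    return [feed_forward(row) for row in combined]


# Hypothetical example: four tokens of width 1 and a ReLU-like feed forward step.
embeddings = [[1.0], [2.0], [3.0], [4.0]]
mixed = dft_along_sequence(embeddings)
output = encoder_block(embeddings, lambda row: [max(0.0, v) for v in row])
```

The transform has no learned parameters, so all trainable weights in such a block sit in the feed forward layer.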

5.

Patent application
ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS (legal status: in force)

Publication No.: US20220156553A1

Publication date: 2022-05-19

Application No.: US17589542

Filing date: 2022-01-31

Applicant: Google LLC

Inventor: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed

IPC: G06N3/04, G06N20/00, G06N3/063, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.

6.

Patent application
ATTENTION NEURAL NETWORKS WITH SPARSE ATTENTION MECHANISMS (legal status: in force)

Publication No.: US20210383191A1

Publication date: 2021-12-09

Application No.: US17341193

Filing date: 2021-06-07

Applicant: Google LLC

Inventor: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed

IPC: G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
