-
Publication No.: US20240320469A1
Publication Date: 2024-09-26
Application No.: US18679200
Filing Date: 2024-05-30
Applicant: DeepMind Technologies Limited
Inventor: Emilio Parisotto , Hasuk Song , Jack William Rae , Siddhant Madhu Jayakumar , Maxwell Elliot Jaderberg , Razvan Pascanu , Caglar Gulcehre
Abstract: A system including an attention neural network that is configured to receive an input sequence and to process the input sequence to generate an output is described. The attention neural network includes: an attention block configured to receive a query input, a key input, and a value input that are derived from an attention block input. The attention block includes an attention neural network layer configured to: receive an attention layer input derived from the query input, the key input, and the value input, and apply an attention mechanism to the query input, the key input, and the value input to generate an attention layer output for the attention neural network layer; and a gating neural network layer configured to apply a gating mechanism to the attention block input and the attention layer output of the attention neural network layer to generate a gated attention output.
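The gated attention block described in this abstract can be sketched as follows. The query, key, and value are derived from the attention block input, an attention mechanism produces the attention layer output, and a gating layer then combines the block input with that output. This is a minimal illustrative sketch, assuming a single-head scaled dot-product attention and a simple sigmoid gate; the patent covers other gating variants, and the weight names (`Wq`, `Wk`, `Wv`, `Wg`, `bg`) are hypothetical, not taken from the disclosure.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention (one attention mechanism the block could use)
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    return softmax(scores) @ v

def gated_attention_block(x, Wq, Wk, Wv, Wg, bg):
    # Query, key, and value inputs derived from the attention block input x
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    y = attention(q, k, v)                     # attention layer output
    z = 1.0 / (1.0 + np.exp(-(x @ Wg + bg)))   # sigmoid gate computed from the block input
    # Gated combination of block input and attention output (one possible gating mechanism)
    return (1.0 - z) * x + z * y
```

One motivation for such gating (e.g. in gated transformer variants for reinforcement learning) is that with a strongly negative gate bias `bg` the block initially behaves like an identity map on its input, which can stabilize early training.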
-
Publication No.: US12033055B2
Publication Date: 2024-07-09
Application No.: US17763984
Filing Date: 2020-09-07
Applicant: DeepMind Technologies Limited
Inventor: Emilio Parisotto , Hasuk Song , Jack William Rae , Siddhant Madhu Jayakumar , Maxwell Elliot Jaderberg , Razvan Pascanu , Caglar Gulcehre
Abstract: A system including an attention neural network that is configured to receive an input sequence and to process the input sequence to generate an output is described. The attention neural network includes: an attention block configured to receive a query input, a key input, and a value input that are derived from an attention block input. The attention block includes an attention neural network layer configured to: receive an attention layer input derived from the query input, the key input, and the value input, and apply an attention mechanism to the query input, the key input, and the value input to generate an attention layer output for the attention neural network layer; and a gating neural network layer configured to apply a gating mechanism to the attention block input and the attention layer output of the attention neural network layer to generate a gated attention output.
-
Publication No.: US20220366218A1
Publication Date: 2022-11-17
Application No.: US17763984
Filing Date: 2020-09-07
Applicant: DeepMind Technologies Limited
Inventor: Emilio Parisotto , Hasuk Song , Jack William Rae , Siddhant Madhu Jayakumar , Maxwell Elliot Jaderberg , Razvan Pascanu , Caglar Gulcehre
Abstract: A system including an attention neural network that is configured to receive an input sequence and to process the input sequence to generate an output is described. The attention neural network includes: an attention block configured to receive a query input, a key input, and a value input that are derived from an attention block input. The attention block includes an attention neural network layer configured to: receive an attention layer input derived from the query input, the key input, and the value input, and apply an attention mechanism to the query input, the key input, and the value input to generate an attention layer output for the attention neural network layer; and a gating neural network layer configured to apply a gating mechanism to the attention block input and the attention layer output of the attention neural network layer to generate a gated attention output.
-
Publication No.: US20230124177A1
Publication Date: 2023-04-20
Application No.: US17914035
Filing Date: 2021-06-04
Applicant: DeepMind Technologies Limited
Inventor: Siddhant Madhu Jayakumar , Razvan Pascanu , Jack William Rae , Simon Osindero , Erich Konrad Elsen
IPC: G06N3/08 , G06F18/211
Abstract: A computer-implemented method of training a neural network. The method comprises repeatedly determining a forward-pass set of network parameters by selecting a first sub-set of parameters of the neural network and setting all other parameters of the forward-pass set of network parameters to zero. The method then processes a training data item using the neural network in accordance with the forward-pass set of network parameters to generate a neural network output, determines a value of an objective function from the neural network output and the training data item, selects a second sub-set of network parameters, determines a backward-pass set of network parameters comprising the first and second sub-sets of parameters, and updates parameters corresponding to the backward-pass set of network parameters using a gradient estimate determined from the value of the objective function.
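The training step in this abstract can be sketched on a toy linear model: a first parameter sub-set is selected for the forward pass (all other parameters set to zero), a loss is computed, a second sub-set is selected, and only the union (the backward-pass set) is updated from the gradient estimate. This is a minimal sketch under assumed choices, namely magnitude-based top-k selection and a mean-squared-error objective; the patent does not prescribe these specifics, and all function and variable names here are illustrative.

```python
import numpy as np

def topk_mask(w, k):
    # Boolean mask keeping the k largest-magnitude entries of w
    # (magnitude-based selection is an assumption; the claim only requires a sub-set)
    idx = np.argsort(np.abs(w).ravel())[-k:]
    m = np.zeros(w.size, dtype=bool)
    m[idx] = True
    return m.reshape(w.shape)

def sparse_train_step(w, x, y_true, k_fwd, k_bwd, lr=0.1):
    fwd = topk_mask(w, k_fwd)                 # first sub-set: forward-pass parameters
    w_fwd = np.where(fwd, w, 0.0)             # all other parameters set to zero
    y = x @ w_fwd                             # forward pass through the (linear) network
    loss = 0.5 * np.mean((y - y_true) ** 2)   # value of the objective function
    bwd = topk_mask(w, k_bwd)                 # second sub-set (here k_bwd >= k_fwd)
    grad = x.T @ (y - y_true) / len(x)        # gradient estimate from the objective value
    # Update only the backward-pass set: the union of the two sub-sets
    w_new = w - lr * np.where(fwd | bwd, grad, 0.0)
    return w_new, loss
```

Repeating this step keeps the forward pass sparse throughout training while still letting currently-zeroed parameters in the second sub-set receive gradient and potentially re-enter the forward set.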