-
Publication No.: US11238332B2
Publication Date: 2022-02-01
Application No.: US17341193
Filing Date: 2021-06-07
Applicant: Google LLC
Inventor: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
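The mechanism described in the abstract can be illustrated with a minimal sketch: positions in a proper subset (here, "global" positions) attend to every input position, while all remaining positions attend only to a local window plus that subset. This is an illustrative toy, not the claimed implementation; the function name, window scheme, and shapes are assumptions.

```python
import numpy as np

def sparse_attention(q, k, v, global_idx, window=1):
    """Toy sparse attention. Positions in `global_idx` (a proper subset
    of input positions) attend to all positions; every other position
    attends only to a local window around itself plus the subset.
    q, k, v each have shape (n, d)."""
    n, d = q.shape
    mask = np.full((n, n), -np.inf)
    for i in range(n):
        if i in global_idx:
            mask[i, :] = 0.0                     # subset positions: full attention
        else:
            lo, hi = max(0, i - window), min(n, i + window + 1)
            mask[i, lo:hi] = 0.0                 # other positions: local attention
            mask[i, list(global_idx)] = 0.0      # ...plus the global subset
    scores = q @ k.T / np.sqrt(d) + mask
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

The key point the abstract makes is visible in the mask construction: the attention pattern is decided per position by membership in the proper subset, which is what reduces the quadratic cost for the non-subset positions.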
-
Publication No.: US20250124264A1
Publication Date: 2025-04-17
Application No.: US18826393
Filing Date: 2024-09-06
Applicant: Google LLC
Inventor: David M. Wang, Gaurav Gupta, Gokhan Mergen, Baixu Chen, Kumar Avinava Dubey, Amr Ahmed
IPC: G06N3/0475, G06F40/166, G06F40/284, G06F40/30, G06F40/40, G06N3/096
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating descriptions of digital components. In one aspect, a method includes receiving data indicating a first query received from a client device of a user. An initial digital component is obtained. Search history data that includes a set of related past queries received from the user is obtained. Updated text related to the initial digital component is generated by conditioning a language model with one or more contextual inputs that cause the language model to generate one or more outputs that include the updated text, the one or more contextual inputs characterizing one or more of the first query, data related to the initial digital component, the set of related past queries, or one or more tasks to be performed by the language model. An updated digital component that depicts the updated text is generated and provided.
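The conditioning step in the abstract amounts to assembling the contextual signals (current query, initial component text, related past queries, task) into a single input for the language model. A minimal sketch follows; the field names, prompt layout, and function name are illustrative assumptions, not the patented format.

```python
def build_contextual_prompt(query, component_text, past_queries, task):
    """Assemble one conditioning input for a language model from the
    contextual signals named in the abstract. Layout is illustrative."""
    lines = [
        f"Task: {task}",
        f"Current query: {query}",
        "Related past queries: " + "; ".join(past_queries),
        f"Initial description: {component_text}",
        "Updated description:",
    ]
    return "\n".join(lines)
```

In practice the resulting string (or an equivalent structured input) would be passed to the language model, whose output supplies the updated text depicted in the regenerated digital component.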
-
Publication No.: US20240256865A1
Publication Date: 2024-08-01
Application No.: US18430586
Filing Date: 2024-02-01
Applicant: Google LLC
Inventor: Deepali Jain, Krzysztof Marcin Choromanski, Sumeet Singh, Vikas Sindhwani, Tingnan Zhang, Jie Tan, Kumar Avinava Dubey
IPC: G06N3/08, G06N3/0455
CPC classification number: G06N3/08, G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training neural networks. One of the methods for training a neural network configured to perform a machine learning task includes performing, at each of a plurality of iterations: performing a training step to obtain respective new gradients of a loss function; for each network parameter: generating an optimizer network input; processing the optimizer network input using an optimizer neural network, wherein the processing comprises, for each cell: generating a cell input for the cell; and processing the cell input for the cell to generate a cell output, wherein the processing comprises: obtaining latent embeddings from the cell input; generating the cell output from the hidden state; and determining an update to the hidden state; and generating an optimizer network output defining an update for the network parameter; and applying the update to the network parameter.
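The per-parameter flow the abstract describes (cell input → latent embedding → cell output from the hidden state → hidden-state update → parameter update) can be sketched with a toy stateful cell. The weights here are random constants purely for illustration; in the patent the optimizer network's own parameters would themselves be trained. All names and dimensions below are assumptions.

```python
import numpy as np

class LearnedOptimizerCell:
    """Toy per-parameter optimizer cell, following the ordering in the
    abstract: embed the cell input, emit the cell output from the
    current hidden state, then update the hidden state."""
    def __init__(self, hidden_dim=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.standard_normal((1, hidden_dim)) * 0.1
        self.w_h = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
        self.w_out = rng.standard_normal((hidden_dim, 1)) * 0.1

    def step(self, grad, hidden):
        latent = np.tanh(np.array([[grad]]) @ self.w_in)   # latent embedding from cell input
        update = (hidden @ self.w_out).item()              # cell output from hidden state
        new_hidden = np.tanh(latent + hidden @ self.w_h)   # hidden-state update
        return update, new_hidden

def train_step(params, grads, cell, hiddens):
    """Run the cell on every network parameter's gradient and apply
    the resulting updates (one hidden state kept per parameter)."""
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        update, hiddens[i] = cell.step(g, hiddens[i])
        new_params.append(p + update)
    return new_params
```

Because the cell carries a hidden state across training steps, the update for each parameter can depend on its gradient history, not just the current gradient, which is the basic appeal of a learned optimizer over a fixed rule like SGD.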
-
Publication No.: US20220156553A1
Publication Date: 2022-05-19
Application No.: US17589542
Filing Date: 2022-01-31
Applicant: Google LLC
Inventor: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
-
Publication No.: US20210383191A1
Publication Date: 2021-12-09
Application No.: US17341193
Filing Date: 2021-06-07
Applicant: Google LLC
Inventor: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.