-
Publication number: US20220245428A1
Publication date: 2022-08-04
Application number: US17592796
Filing date: 2022-02-04
Applicant: Google LLC
Inventor: Yi Tay, Da-Cheng Juan, Dara Bahri, Donald Arthur Metzler, Jr., Jai Prakash Gupta, Mostafa Dehghani, Phillip Pham, Vamsi Krishna Aribandi, Zhen Qin
Abstract: Provided are machine-learned attention models that feature omnidirectional processing, example implementations of which can be referred to as Omnidirectional Representations from Transformers (OMNINET). In example models described in the present disclosure, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in some or all of the other layers across the entire network.
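The abstract describes the omnidirectional idea only at a high level. As a rough, non-authoritative illustration, the PyTorch sketch below lets queries from the top layer attend over the hidden states of every layer; the class name, hyperparameters, and the use of standard multi-head attention are assumptions for illustration, not details taken from the application.

```python
import torch
import torch.nn as nn

class OmnidirectionalAttention(nn.Module):
    """Sketch of an omnidirectional attention block: queries from the final
    layer attend over the token representations of *all* layers, not just the
    current one. `d_model` and `num_heads` are illustrative hyperparameters."""

    def __init__(self, d_model=64, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, layer_states):
        # layer_states: list of [batch, seq_len, d_model] tensors, one per layer.
        # Flatten the (layer, token) grid into one long memory sequence.
        memory = torch.cat(layer_states, dim=1)       # [batch, L * seq_len, d_model]
        queries = layer_states[-1]                    # attend from the top layer
        out, _ = self.attn(queries, memory, memory)   # every token can attend to
        return out                                    # every token at every layer


if __name__ == "__main__":
    batch, seq_len, d_model, num_layers = 2, 8, 64, 3
    states = [torch.randn(batch, seq_len, d_model) for _ in range(num_layers)]
    print(OmnidirectionalAttention(d_model)(states).shape)  # torch.Size([2, 8, 64])
```

In a full model, a block like this would sit on top of a stack of ordinary self-attention layers whose per-layer outputs supply `layer_states`.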
-
Publication number: US20220245432A1
Publication date: 2022-08-04
Application number: US17592174
Filing date: 2022-02-03
Applicant: Google LLC
Inventor: Yi Tay, Donald Arthur Metzler, Jr., Dara Bahri, Mostafa Dehghani
Abstract: The present disclosure provides echo-attention layers, a new efficient method for increasing the expressiveness of self-attention layers without incurring significant parameter or training time costs. One intuition behind the proposed method is to learn to echo, i.e., attend once and then get N echoed attentions for free (or at a relatively cheap cost). As compared to stacking new layers, the proposed echoed attentions are targeted at providing similar representation power at a better cost efficiency.
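As a rough sketch of the "attend once, echo N times" idea, the code below computes the softmax attention matrix a single time and then reuses it for several cheap echoed passes; the per-echo linear projections are an assumed parameterization, not the one claimed in the application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EchoAttention(nn.Module):
    """Sketch of an echo-attention layer: the (expensive) attention matrix is
    computed once, then reused N times through cheap per-echo value projections,
    which add far fewer parameters than stacking new attention layers."""

    def __init__(self, d_model=64, num_echoes=3):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.echo_proj = nn.ModuleList(
            [nn.Linear(d_model, d_model, bias=False) for _ in range(num_echoes)]
        )

    def forward(self, x):
        # x: [batch, seq_len, d_model]
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / (x.shape[-1] ** 0.5)
        weights = F.softmax(scores, dim=-1)      # attend once
        out = weights @ v                        # base attention output
        for proj in self.echo_proj:
            out = out + weights @ proj(out)      # N echoes reuse `weights` for free
        return out


if __name__ == "__main__":
    x = torch.randn(2, 10, 64)
    print(EchoAttention(64)(x).shape)  # torch.Size([2, 10, 64])
```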
-
Publication number: US20240169184A1
Publication date: 2024-05-23
Application number: US18426212
Filing date: 2024-01-29
Applicant: Google LLC
Inventor: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
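A minimal sketch of adaptive early exiting during auto-regressive decoding, assuming a shared output head at every layer and a simple top-probability threshold as the exit rule (the actual exit criterion and decoder architecture in the application may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitDecoder(nn.Module):
    """Toy auto-regressive model with adaptive early exiting: after each layer,
    a shared output head scores the next token, and if the top probability
    clears `threshold` the remaining layers are skipped for that step."""

    def __init__(self, vocab_size=100, d_model=64, num_layers=6, threshold=0.9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
             for _ in range(num_layers)]
        )
        self.head = nn.Linear(d_model, vocab_size)  # shared across all exit points
        self.threshold = threshold

    @torch.no_grad()
    def generate(self, prefix, max_new_tokens=5):
        tokens = prefix.clone()
        for _ in range(max_new_tokens):
            n = tokens.shape[1]
            # Causal mask so each position only attends to earlier positions.
            mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
            h = self.embed(tokens)
            for layer in self.layers:
                h = layer(h, src_mask=mask)
                probs = F.softmax(self.head(h[:, -1]), dim=-1)
                confidence, next_token = probs.max(dim=-1)
                # Batch size 1 is assumed for the scalar exit check below.
                if confidence.item() >= self.threshold:
                    break                       # confident enough: exit early
            tokens = torch.cat([tokens, next_token[:, None]], dim=1)
        return tokens


if __name__ == "__main__":
    model = EarlyExitDecoder()
    out = model.generate(torch.tensor([[1, 2, 3]]))
    print(out.shape)  # torch.Size([1, 8])
```

Easy tokens exit after a few layers while hard tokens use the full stack, which is the source of the decoding-time savings described in the abstract.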
-
Publication number: US20240020516A1
Publication date: 2024-01-18
Application number: US18222395
Filing date: 2023-07-14
Applicant: Google LLC
Inventor: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
-
Publication number: US11886976B1
Publication date: 2024-01-30
Application number: US18222395
Filing date: 2023-07-14
Applicant: Google LLC
Inventor: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
IPC: G06N3/0455
CPC classification number: G06N3/0455
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
-
Publication number: US20210248450A1
Publication date: 2021-08-12
Application number: US17169718
Filing date: 2021-02-08
Applicant: Google LLC
Inventor: Yi Tay, Liu Yang, Donald Arthur Metzler, Jr., Dara Bahri, Da-Cheng Juan
Abstract: A system for performing a machine learning task on a network input is described. The system includes one or more computers and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to implement (i) multiple sorting networks in which each sorting network is configured to sort vector blocks in a sequence of vector blocks to generate a sorted sequence of vector blocks; and (ii) a sorting attention neural network configured to perform the machine learning task on the input sequence by executing multiple sorting attention mechanisms using the sorting networks.
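One way such a mechanism could be wired up is sketched below: a small sorting network scores each vector block, blocks are re-ordered by score, attention runs over the sorted sequence, and outputs are scattered back to the original order. The block size, scoring function, and attention configuration are illustrative assumptions rather than the claimed design.

```python
import torch
import torch.nn as nn

class SortingAttention(nn.Module):
    """Sketch of a sorting attention mechanism: split the sequence into vector
    blocks, score them with a small sorting network, re-order, attend over the
    sorted sequence, and map the outputs back to the original token order."""

    def __init__(self, d_model=64, block_size=4, num_heads=4):
        super().__init__()
        self.block_size = block_size
        self.scorer = nn.Linear(d_model, 1)  # the "sorting network"
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, x):
        # x: [batch, seq_len, d_model]; seq_len divisible by block_size (assumed).
        b, n, d = x.shape
        blocks = x.view(b, n // self.block_size, self.block_size, d)
        scores = self.scorer(blocks.mean(dim=2)).squeeze(-1)   # [b, num_blocks]
        order = scores.argsort(dim=-1)                         # sorted block order
        idx = order[..., None, None].expand(-1, -1, self.block_size, d)
        sorted_x = torch.gather(blocks, 1, idx).reshape(b, n, d)
        out, _ = self.attn(sorted_x, sorted_x, sorted_x)
        # Scatter back so outputs line up with the original token order.
        out_blocks = out.view(b, n // self.block_size, self.block_size, d)
        unsorted = torch.zeros_like(out_blocks).scatter(1, idx, out_blocks)
        return unsorted.reshape(b, n, d)


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)
    print(SortingAttention(64)(x).shape)  # torch.Size([2, 16, 64])
```

A system as described in the abstract would run several such sorting attention mechanisms, each with its own sorting network.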
-
Publication number: US20240289552A1
Publication date: 2024-08-29
Application number: US18564859
Filing date: 2022-05-27
Applicant: Google LLC
Inventor: Yi Tay, Dara Bahri, Donald Arthur Metzler, Jr., Hyung Won Chung, Jai Prakash Gupta, Sebastian Nikolas Ruder, Simon Baumgartner, Vinh Quoc Tran, Zhen Qin
IPC: G06F40/284
CPC classification number: G06F40/284
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on an input sequence of characters that has a respective character at each of a plurality of character positions to generate a network output. One of the systems includes a neural network configured to perform the machine learning task, the neural network comprising a gradient-based sub-word tokenizer and an output neural network. The gradient-based sub-word tokenizer is configured to apply a learned, i.e., flexible, sub-word tokenization strategy to the input sequence of characters to generate a sequence of latent sub-word representations. The output neural network is configured to process the latent sub-word representations to generate the network output for the task.
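As a rough illustration of a gradient-based (soft) sub-word tokenizer, the sketch below mean-pools character blocks of several candidate sizes, mixes them with a learned softmax score per position, and downsamples the result into latent sub-word representations; the block sizes, pooling operator, and downsampling rate are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradientSubwordTokenizer(nn.Module):
    """Sketch of a gradient-based sub-word tokenizer: candidate blocks of several
    sizes are pooled for every character position, a learned scorer softly mixes
    the block sizes, and the mixed sequence is downsampled into latent sub-word
    representations that a downstream output network can process."""

    def __init__(self, vocab_size=256, d_model=64, block_sizes=(1, 2, 4), downsample=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.block_sizes = block_sizes
        self.downsample = downsample
        self.scorer = nn.Linear(d_model, 1)

    def forward(self, chars):
        # chars: [batch, num_chars] integer character ids
        x = self.embed(chars)                                   # [b, n, d]
        candidates, scores = [], []
        for size in self.block_sizes:
            # Mean-pool a block of `size` characters ending at each position.
            pooled = F.avg_pool1d(x.transpose(1, 2), size, stride=1)
            pooled = F.pad(pooled, (size - 1, 0)).transpose(1, 2)  # re-align to length n
            candidates.append(pooled)
            scores.append(self.scorer(pooled))
        # Soft (differentiable) choice among block sizes at every position.
        weights = F.softmax(torch.stack(scores, dim=0), dim=0)
        mixed = (torch.stack(candidates, dim=0) * weights).sum(dim=0)   # [b, n, d]
        # Downsample characters into latent sub-word representations.
        latent = F.avg_pool1d(mixed.transpose(1, 2), self.downsample).transpose(1, 2)
        return latent                                           # [b, n // downsample, d]


if __name__ == "__main__":
    chars = torch.randint(0, 256, (2, 32))
    print(GradientSubwordTokenizer()(chars).shape)  # torch.Size([2, 8, 64])
```

Because every step is differentiable, the tokenization strategy is learned jointly with the output network rather than fixed in advance.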
-
Publication number: US20220383120A1
Publication date: 2022-12-01
Application number: US17827448
Filing date: 2022-05-27
Applicant: Google LLC
Inventor: Dara Bahri, Donald Arthur Metzler, Jr., Hanxi Heinrich Jiang, Yi Tay
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network having a plurality of network parameters. One of the methods includes obtaining an unlabeled training input from a set of unlabeled training data; processing the unlabeled training input to generate a first embedding; generating a corrupted version of the unlabeled training input, comprising determining a proper subset of the feature dimensions and, for each feature dimension that is in the proper subset of feature dimensions, applying a corruption to the respective feature in the feature dimension using one or more feature values sampled from a marginal distribution of the feature dimension as specified in the set of unlabeled training data; processing the corrupted version of the unlabeled training input to generate a second embedding; and determining an update to the current values of the plurality of network parameters.
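A minimal sketch of the corruption scheme described above: a random subset of feature dimensions is resampled from each feature's empirical marginal distribution over the unlabeled data, and the clean and corrupted views are embedded and compared. The InfoNCE objective and the encoder used here are assumed stand-ins for the update rule in the application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def corrupt(batch, dataset, corruption_rate=0.3):
    """Corrupt a random proper subset of feature dimensions by resampling each
    selected feature from its empirical marginal distribution over `dataset`.
    `corruption_rate` is an illustrative hyperparameter."""
    b, d = batch.shape
    mask = torch.rand(b, d) < corruption_rate              # which features to corrupt
    # For each (example, feature), take the value of that feature from a random
    # row of the dataset, i.e., a sample from the feature's marginal.
    random_rows = torch.randint(0, dataset.shape[0], (b, d))
    marginal_samples = dataset[random_rows, torch.arange(d)]
    return torch.where(mask, marginal_samples, batch)

def info_nce(z1, z2, temperature=0.1):
    """Contrastive loss between clean and corrupted embeddings (assumed objective)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature
    labels = torch.arange(z1.shape[0])
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    dataset = torch.randn(1000, 16)                         # unlabeled training data
    encoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 32))
    optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

    batch = dataset[torch.randint(0, 1000, (32,))]          # unlabeled training inputs
    z_clean = encoder(batch)                                # first embedding
    z_corrupt = encoder(corrupt(batch, dataset))            # second embedding
    loss = info_nce(z_clean, z_corrupt)
    loss.backward()                                         # update to network parameters
    optimizer.step()
    print(float(loss))
```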