-
Publication number: US20230154161A1
Publication date: 2023-05-18
Application number: US17988655
Filing date: 2022-11-16
Applicant: Google LLC
Inventor: Hieu Hy Pham , Zihang Dai , Golnaz Ghiasi , Hanxiao Liu , Wei Yu , Mingxing Tan , Quoc V. Le
IPC: G06V10/774 , G06V10/776 , G06F40/126 , G06V10/82 , G06T9/00 , G06V10/764
CPC classification number: G06V10/774 , G06V10/776 , G06F40/126 , G06V10/82 , G06T9/002 , G06V10/764
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using memory-optimized contrastive learning to train image encoder and text encoder neural networks.
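A minimal sketch of the symmetric image-text contrastive objective such encoder pairs are commonly trained with. This illustrates a plain InfoNCE-style loss, not the memory-optimized variant the patent claims; the function name and temperature value are illustrative:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """image_emb, text_emb: (batch, dim) outputs of the two encoders."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (batch, batch)
    # Matching image/text pairs sit on the diagonal, so each modality's
    # loss is an ordinary cross-entropy against the diagonal indices.
    targets = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2
```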
-
Publication number: US20220383069A1
Publication date: 2022-12-01
Application number: US17827130
Filing date: 2022-05-27
Applicant: Google LLC
Inventor: Zihang Dai , Hanxiao Liu , Mingxing Tan , Quoc V. Le
Abstract: A computer-implemented method for performing computer vision with reduced computational cost and improved accuracy can include: obtaining, by a computing system including one or more computing devices, input data comprising an input tensor having one or more dimensions; providing, by the computing system, the input data to a machine-learned convolutional attention network that includes two or more network stages; and, in response to providing the input data, receiving, by the computing system, a machine-learning prediction from the machine-learned convolutional attention network. The convolutional attention network can include at least one attention block, wherein the attention block includes a relative attention mechanism comprising the sum of a static convolution kernel and an adaptive attention matrix. This provides for improved generalization, capacity, and efficiency of the convolutional attention network relative to some existing models.
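A minimal sketch of the relative attention mechanism the abstract describes, assuming the common pre-softmax formulation in which an input-adaptive query-key term is summed with a static, translation-invariant positional bias; shapes and names are illustrative:

```python
import torch
import torch.nn.functional as F

def relative_attention(q, k, v, rel_bias):
    """q, k, v: (batch, length, dim); rel_bias: (length, length), a learned
    static kernel w[i - j] expanded to pairwise form."""
    adaptive = torch.einsum('bid,bjd->bij', q, k) / q.size(-1) ** 0.5
    # Sum of the adaptive attention matrix and the static kernel.
    weights = F.softmax(adaptive + rel_bias, dim=-1)
    return torch.einsum('bij,bjd->bid', weights, v)
```

Because rel_bias depends only on relative position, it plays the role of the static convolution kernel, while the query-key term supplies the adaptive attention matrix.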
-
Publication number: US20220215209A1
Publication date: 2022-07-07
Application number: US17606190
Filing date: 2020-04-24
Applicant: Google LLC
Inventor: Thang Minh Luong , Quoc V. Le , Qizhe Xie , Zihang Dai
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving training data comprising a plurality of unlabeled training inputs and a plurality of labeled training inputs; generating augmented training data, comprising generating, for each of the plurality of unlabeled training inputs, a respective augmented training input by applying a data augmentation technique to the unlabeled training input; and training the machine learning model on the augmented training data. In particular, but not exclusively, the model may be trained for perceptual tasks (e.g. tasks relating to vision or speech).
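The abstract does not spell out the loss applied to the augmented unlabeled inputs; the sketch below assumes a consistency-training reading, in which the model's predictions on the original unlabeled inputs serve as soft targets for its predictions on the augmented versions. The model, augment, and lam names are illustrative placeholders:

```python
import torch
import torch.nn.functional as F

def training_step(model, labeled_x, labels, unlabeled_x, augment, lam=1.0):
    # Standard supervised loss on the labeled inputs.
    sup_loss = F.cross_entropy(model(labeled_x), labels)
    # Predictions on the unlabeled inputs serve as soft targets...
    with torch.no_grad():
        targets = F.softmax(model(unlabeled_x), dim=-1)
    # ...that the model should reproduce on the augmented versions.
    aug_logits = model(augment(unlabeled_x))
    consistency = F.kl_div(F.log_softmax(aug_logits, dim=-1),
                           targets, reduction='batchmean')
    return sup_loss + lam * consistency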
-
Publication number: US20240428071A1
Publication date: 2024-12-26
Application number: US18823611
Filing date: 2024-09-03
Applicant: Google LLC
Inventor: David Richard So , Quoc V. Le , Hanxiao Liu , Wojciech Andrzej Manke , Zihang Dai , Noam M. Shazeer
IPC: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the systems includes an attention neural network configured to perform the machine learning task. The attention neural network includes one or more attention layers that each include a squared ReLU activation layer, a depth-wise convolution layer, or both.
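Both building blocks named in the abstract are simple to state. A minimal sketch, assuming a (batch, channels, length) sequence layout and an illustrative channel count of 512:

```python
import torch
import torch.nn as nn

def squared_relu(x):
    # A standard ReLU whose output is squared elementwise.
    return torch.relu(x) ** 2

# A depth-wise convolution over a (batch, channels, length) sequence:
# setting groups == channels makes each channel convolve independently.
depthwise = nn.Conv1d(in_channels=512, out_channels=512,
                      kernel_size=3, padding=1, groups=512)
```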
-
Publication number: US11755883B2
Publication date: 2023-09-12
Application number: US17827130
Filing date: 2022-05-27
Applicant: Google LLC
Inventor: Zihang Dai , Hanxiao Liu , Mingxing Tan , Quoc V. Le
Abstract: A computer-implemented method for performing computer vision with reduced computational cost and improved accuracy can include: obtaining, by a computing system including one or more computing devices, input data comprising an input tensor having one or more dimensions; providing, by the computing system, the input data to a machine-learned convolutional attention network that includes two or more network stages; and, in response to providing the input data, receiving, by the computing system, a machine-learning prediction from the machine-learned convolutional attention network. The convolutional attention network can include at least one attention block, wherein the attention block includes a relative attention mechanism comprising the sum of a static convolution kernel and an adaptive attention matrix. This provides for improved generalization, capacity, and efficiency of the convolutional attention network relative to some existing models.
-
Publication number: US20250139431A1
Publication date: 2025-05-01
Application number: US18834202
Filing date: 2023-01-30
Applicant: Google LLC
Inventor: Hanxiao Liu , Weizhe Hua , Zihang Dai , Quoc V. Le
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input to generate a network output. In one aspect, one of the systems includes a neural network configured to perform the machine learning task, the neural network including one or more attentive layers that each include a gated attention unit.
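A minimal sketch of a gated attention unit in the spirit of the abstract: an attention-weighted value stream modulated by an elementwise gate computed from the same input. The single-head form, SiLU activations, and all dimensions are assumptions, not the claimed design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAttentionUnit(nn.Module):
    def __init__(self, dim, expanded):
        super().__init__()
        self.to_gate = nn.Linear(dim, expanded)   # gating branch
        self.to_value = nn.Linear(dim, expanded)  # value branch
        self.to_qk = nn.Linear(dim, dim)          # shared query/key projection
        self.out = nn.Linear(expanded, dim)

    def forward(self, x):  # x: (batch, length, dim)
        gate = F.silu(self.to_gate(x))
        value = F.silu(self.to_value(x))
        qk = self.to_qk(x)
        attn = F.softmax(qk @ qk.transpose(-2, -1) / x.size(-1) ** 0.5, dim=-1)
        # Gate the attended values elementwise, then project back.
        return self.out(gate * (attn @ value))
```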
-
Publication number: US12118064B2
Publication date: 2024-10-15
Application number: US17606190
Filing date: 2020-04-24
Applicant: Google LLC
Inventor: Thang Minh Luong , Quoc V. Le , Qizhe Xie , Zihang Dai
IPC: G06F18/21 , G06F18/214 , G06N3/08
CPC classification number: G06F18/217 , G06F18/2148 , G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving training data comprising a plurality of unlabeled training inputs and a plurality of labeled training inputs; generating augmented training data, comprising generating, for each of the plurality of unlabeled training inputs, a respective augmented training input by applying a data augmentation technique to the unlabeled training input; and training the machine learning model on the augmented training data. In particular, but not exclusively, the model may be trained for perceptual tasks (e.g. tasks relating to vision or speech).
-
Publication number: US20220367052A1
Publication date: 2022-11-17
Application number: US17745715
Filing date: 2022-05-16
Applicant: Google LLC
Inventor: Hanxiao Liu , David Richard So , Quoc V. Le , Zihang Dai
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input to generate a network output. In one aspect, one of the systems includes a neural network configured to perform the machine learning task, the neural network including one or more blocks that each include a feedforward spatial transformation unit.
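A minimal sketch of a feedforward spatial transformation unit, assuming a gating formulation in which half of the channels, after a learned linear mixing across the sequence (spatial) dimension, gate the other half elementwise. Dimensions are illustrative, and dim must be even:

```python
import torch
import torch.nn as nn

class SpatialGatingUnit(nn.Module):
    def __init__(self, dim, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(dim // 2)
        # A learned linear map over token positions, not channels.
        self.spatial = nn.Linear(seq_len, seq_len)

    def forward(self, x):  # x: (batch, seq_len, dim)
        u, v = x.chunk(2, dim=-1)
        v = self.norm(v)
        # Mix across the sequence dimension, then gate elementwise.
        v = self.spatial(v.transpose(1, 2)).transpose(1, 2)
        return u * v
```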
-
Publication number: US20230359862A1
Publication date: 2023-11-09
Application number: US18355243
Filing date: 2023-07-19
Applicant: Google LLC
Inventor: Zihang Dai , Mingxing Tan , Quoc V. Le , Hanxiao Liu
Abstract: A computer-implemented method for performing computer vision with reduced computational cost and improved accuracy can include: obtaining, by a computing system including one or more computing devices, input data comprising an input tensor having one or more dimensions; providing, by the computing system, the input data to a machine-learned convolutional attention network that includes two or more network stages; and, in response to providing the input data, receiving, by the computing system, a machine-learning prediction from the machine-learned convolutional attention network. The convolutional attention network can include at least one attention block, wherein the attention block includes a relative attention mechanism comprising the sum of a static convolution kernel and an adaptive attention matrix. This provides for improved generalization, capacity, and efficiency of the convolutional attention network relative to some existing models.
-
Publication number: US20230281400A1
Publication date: 2023-09-07
Application number: US17685774
Filing date: 2022-03-03
Applicant: Google LLC
Inventor: Zirui Wang , Jiahui Yu , Yuan Cao , Wei Yu , Zihang Dai
IPC: G06F40/58 , G06F40/284 , G06V30/10 , G06V10/766
CPC classification number: G06F40/58 , G06F40/284 , G06V10/766 , G06V30/10
Abstract: Example embodiments of the present disclosure relate to systems and methods for pretraining image-processing models on weakly-supervised image-text pairs. The pretraining can include receiving a training sequence for the machine-learned image-processing model. The training sequence can include text tokens and image tokens. A prefix sequence can contain the image tokens. A remainder sequence can include a remainder set of the text tokens. The pretraining can include determining, using the prefix sequence as an input to the machine-learned image-processing model, an objective based on recovery of the remainder sequence. The pretraining can include updating one or more learnable parameters of the machine-learned image-processing model based on the objective.
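A minimal sketch of the prefix objective the abstract describes: the image tokens (plus an optional leading chunk of text tokens) form a prefix that conditions autoregressive recovery of the text-token remainder. The model interface is a hypothetical placeholder, not the patented architecture:

```python
import torch
import torch.nn.functional as F

def prefix_lm_loss(model, image_tokens, text_tokens, prefix_len):
    """image_tokens: (batch, m, d); text_tokens: (batch, n) token ids.

    Assumed (hypothetical) interface: model(prefix_images, prefix_text,
    decoder_in) returns next-token logits over the remainder vocabulary.
    """
    prefix_text = text_tokens[:, :prefix_len]
    remainder = text_tokens[:, prefix_len:]
    logits = model(image_tokens, prefix_text, remainder[:, :-1])
    # Standard next-token cross-entropy, computed over the remainder only.
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           remainder[:, 1:].reshape(-1))
```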