-
Publication Number: US20230154161A1
Publication Date: 2023-05-18
Application Number: US17988655
Application Date: 2022-11-16
Applicant: Google LLC
Inventor: Hieu Hy Pham , Zihang Dai , Golnaz Ghiasi , Hanxiao Liu , Wei Yu , Mingxing Tan , Quoc V. Le
IPC: G06V10/774 , G06V10/776 , G06F40/126 , G06V10/82 , G06T9/00 , G06V10/764
CPC classification number: G06V10/774 , G06V10/776 , G06F40/126 , G06V10/82 , G06T9/002 , G06V10/764
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using memory-optimized contrastive learning to train image encoder and text encoder neural networks.
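For reference only, the sketch below shows a standard symmetric contrastive (InfoNCE-style) objective between image and text embeddings of the kind such encoder pairs are typically trained with; it does not reflect the patent's memory optimizations, and all function and variable names are illustrative assumptions.

```python
# Minimal sketch of a symmetric contrastive objective between an image
# encoder's and a text encoder's embeddings; names are illustrative and
# the memory-optimized training described in the patent is not shown.
import torch
import torch.nn.functional as F

def contrastive_loss(image_embeddings, text_embeddings, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    image_embeddings = F.normalize(image_embeddings, dim=-1)
    text_embeddings = F.normalize(text_embeddings, dim=-1)
    logits = image_embeddings @ text_embeddings.t() / temperature  # [B, B]
    targets = torch.arange(logits.size(0), device=logits.device)   # matched pairs
    loss_i2t = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return 0.5 * (loss_i2t + loss_t2i)
```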
-
Publication Number: US11048875B2
Publication Date: 2021-06-29
Application Number: US16865747
Application Date: 2020-05-04
Applicant: Google LLC
Inventor: Quoc V. Le , Hongrae Lee , Wei Yu
IPC: G06F40/30 , G06F40/284 , G06N3/04 , G06N3/08 , G06F40/289
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing sequential data. In one aspect, a computer-implemented method includes receiving a request to generate a system output for an input data sequence, the input data sequence including a plurality of tokens. One or more tokens may be designated as tokens to be skipped. When a token has not been designated as a token to be skipped, the token is processed using a recurrent neural network to update a current internal state of the recurrent neural network. The system output is generated from the final internal state of the recurrent neural network.
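As an illustration of the skip-token idea the abstract describes (tokens flagged for skipping do not update the recurrent state, and the system output is read from the final internal state), here is a minimal sketch; the choice of a GRU cell and every name in it are assumptions, not details from the patent.

```python
# Minimal sketch: skipped tokens leave the recurrent internal state
# unchanged; the output is computed from the final state.
import torch
import torch.nn as nn

class SkipRNN(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_outputs):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.cell = nn.GRUCell(embed_dim, hidden_dim)
        self.output = nn.Linear(hidden_dim, num_outputs)

    def forward(self, tokens, skip_mask):
        # tokens: [T] token ids; skip_mask: [T] bools, True = skip this token.
        state = torch.zeros(1, self.cell.hidden_size)
        for token, skip in zip(tokens, skip_mask):
            if skip:
                continue  # a skipped token does not update the internal state
            state = self.cell(self.embed(token).unsqueeze(0), state)
        return self.output(state)  # system output from the final internal state
```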
-
Publication Number: US20250139379A1
Publication Date: 2025-05-01
Application Number: US18385270
Application Date: 2023-10-30
Applicant: Google LLC
Inventor: Sanil Jain , Wei Yu , Alessandro Agostini , Agoston Weisz , Michael Andrew Goodman , Attila Dankovics , Elle Chae , Evgeny Sluzhaev , Amin Ghafouri , Golnaz Ghiasi , Igor Petrovski , Konstantin Shagin , Marcelo Menegali , Oscar Akerlund , Rakesh Shivanna , Thang Luong , Tiffany Chen , Vikas Peswani , Yifeng Lu
IPC: G06F40/40 , G06F16/483
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)) and other generative model(s). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based input, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using an LLM, LLM input to generate LLM output, and determine, based on the LLM output, textual content and generative multimedia content for inclusion in the multi-modal response. In some implementations, the generative multimedia content can be generated by another generative model (e.g., an image generator, a video generator, an audio generator, etc.) based on generative multimedia content prompt(s) that are included in the LLM output and that are indicative of the generative multimedia content. In various implementations, the generative multimedia content can be interleaved between segments of the textual content.
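As a rough illustration of interleaving generative multimedia content between segments of textual content, the sketch below splits a hypothetical LLM output on inline image-prompt tags and calls a separate image generator for each prompt; the tag format, function names, and generator interface are all assumptions for illustration, not the patent's actual format.

```python
# Sketch: interleave generated images between the text segments that
# surround each generative-media prompt found in the LLM output.
import re

def build_multimodal_response(llm_output: str, image_generator):
    """Split LLM output on hypothetical <image: ...> prompt tags and
    interleave generated images between the surrounding text segments."""
    response = []
    parts = re.split(r"<image:\s*(.*?)>", llm_output)
    for i, part in enumerate(parts):
        if i % 2 == 0:
            if part.strip():
                response.append({"type": "text", "content": part.strip()})
        else:
            # Odd-index parts are generative multimedia content prompts.
            response.append({"type": "image", "content": image_generator(part)})
    return response
```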
-
Publication Number: US20230281400A1
Publication Date: 2023-09-07
Application Number: US17685774
Application Date: 2022-03-03
Applicant: Google LLC
Inventor: Zirui Wang , Jiahui Yu , Yuan Cao , Wei Yu , Zihang Dai
IPC: G06F40/58 , G06F40/284 , G06V30/10 , G06V10/766
CPC classification number: G06F40/58 , G06F40/284 , G06V10/766 , G06V30/10
Abstract: Example embodiments of the present disclosure relate to systems and methods for pretraining image-processing models on weakly-supervised image-text pairs. The pretraining can include receiving a training sequence for the machine-learned image-processing model. The training sequence can include text tokens and image tokens. A prefix sequence can contain the image tokens. A remainder sequence can include a remainder set of the text tokens. The pretraining can include determining, using the prefix sequence as an input to the machine-learned image-processing model, an objective based on recovery of the remainder sequence. The pretraining can include updating one or more learnable parameters of the machine-learned image-processing model based on the objective.
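A minimal sketch of a prefix-style recovery objective consistent with the abstract: the image tokens (optionally together with a leading slice of the text tokens, an assumption made here) form the prefix, and the loss is the cross-entropy of recovering the remaining text tokens. The model interface and all names are hypothetical.

```python
# Sketch of a prefix-conditioned recovery objective: the prefix conditions
# the model and the remainder of the text tokens is recovered with a
# next-token cross-entropy loss. The model call is a hypothetical interface.
import torch
import torch.nn.functional as F

def prefix_recovery_loss(model, image_tokens, text_tokens, prefix_len):
    """image_tokens: [B, I]; text_tokens: [B, T]; prefix_len: number of
    text tokens folded into the prefix alongside the image tokens."""
    prefix = torch.cat([image_tokens, text_tokens[:, :prefix_len]], dim=1)
    remainder = text_tokens[:, prefix_len:]          # tokens to recover
    # Hypothetical model call: given the prefix and the remainder shifted
    # right by one position, return next-token logits of shape [B, R-1, V].
    logits = model(prefix, remainder[:, :-1])
    targets = remainder[:, 1:]
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))
```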
-
Publication Number: US20210383223A1
Publication Date: 2021-12-09
Application Number: US17337834
Application Date: 2021-06-03
Applicant: Google LLC
Inventor: Mingxing Tan , Xuanyi Dong , Wei Yu , Quoc V. Le , Daiyi Peng
Abstract: The present disclosure provides a differentiable joint hyper-parameter and architecture search approach, with some implementations including the idea of discretizing the continuous space into a linear combination of multiple categorical bases. One example element of the proposed approach is the use of weight sharing across all architecture- and hyper-parameters, which enables efficient search over the large joint search space. Experimental results on MobileNet/ResNet/EfficientNet/BERT show that the proposed systems significantly improve accuracy by up to 2% on ImageNet and F1 by up to 0.4 on SQuAD, with search cost comparable to training a single model. Compared to other AutoML methods, such as random search or Bayesian methods, the proposed techniques can achieve better accuracy with 10× less compute cost.
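As a toy illustration of the discretization idea in the abstract, the sketch below represents one continuous hyper-parameter as a softmax-weighted linear combination of a small categorical basis, which makes the selection differentiable; the basis values and class names are illustrative, and the weight-sharing machinery of the actual approach is not shown.

```python
# Sketch: a continuous choice relaxed into a softmax-weighted linear
# combination of a categorical basis, so it can be optimized by gradients.
import torch
import torch.nn as nn

class DifferentiableChoice(nn.Module):
    def __init__(self, basis_values):
        super().__init__()
        self.basis = torch.tensor(basis_values)            # categorical basis
        self.logits = nn.Parameter(torch.zeros(len(basis_values)))

    def forward(self):
        weights = torch.softmax(self.logits, dim=0)        # selection weights
        return (weights * self.basis).sum()                # continuous relaxation

# Example: relax a dropout-rate choice over a discretized basis.
dropout_choice = DifferentiableChoice([0.0, 0.1, 0.2, 0.3])
rate = dropout_choice()  # differentiable w.r.t. dropout_choice.logits
```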