-
Publication No.: US20250053751A1
Publication date: 2025-02-13
Application No.: US18413495
Application date: 2024-01-16
Applicant: GOOGLE LLC
Inventor: Oscar Akerlund, Evgeny Sluzhaev, Golnaz Ghiasi, Thang Luong, Yifeng Lu, Igor Petrovski, Agoston Weisz, Wei Yu, Rakesh Shivanna, Michael Andrew Goodman, Apoorv Kulshreshtha, Yu Du, Amin Ghafouri, Sanil Jain, Dustin Tran, Vikas Peswani, YaGuang Li
IPC: G06F40/40
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based output, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using a LLM, LLM input (e.g., that includes at least the NL based input) to generate LLM output, and determine, based on the LLM output, textual content for inclusion in the multi-modal response and multimedia content for inclusion in the multi-modal response. In some implementations, the multimedia content can be obtained based on a multimedia content tag that is included in the LLM output and that is indicative of the multimedia content. In various implementations, the multimedia content can be interleaved between segments of the textual content.
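The interleaving described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical tag syntax `[IMG: query]` inside the LLM output and a stand-in `fake_fetch` lookup; the abstract specifies neither the tag format nor how the multimedia content is retrieved, so both are illustrative assumptions rather than the claimed implementation.

```python
import re

# Hypothetical tag syntax; the abstract does not specify one.
TAG = re.compile(r"\[IMG:\s*(.*?)\]")


def interleave(llm_output, fetch):
    """Split LLM output into ordered text and multimedia segments."""
    segments = []
    pos = 0
    for match in TAG.finditer(llm_output):
        # Text that precedes the tag becomes its own segment.
        text = llm_output[pos:match.start()].strip()
        if text:
            segments.append(("text", text))
        # The tag content is used as the lookup key for multimedia.
        segments.append(("media", fetch(match.group(1))))
        pos = match.end()
    tail = llm_output[pos:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


def fake_fetch(query):
    # Stand-in for a real multimedia lookup (for example, an image search).
    return f"<image for '{query}'>"


response = interleave(
    "The Eiffel Tower is in Paris. [IMG: Eiffel Tower at night] It opened in 1889.",
    fake_fetch,
)
```

Here `response` holds three segments in rendering order: a text segment, the fetched multimedia item, and the remaining text.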
-
Publication No.: US20240282131A1
Publication date: 2024-08-22
Application No.: US18421672
Application date: 2024-01-24
Applicant: Google LLC
Inventor: Jie Ren, Zhe Liu, James Urquhart Allingham, Michael Ward Dusenberry, Dustin Tran, Yin Cui, Balaji Lakshminarayanan, Xiuye Gu
IPC: G06V20/70, G06F40/40, G06V10/74, G06V10/764, G06V10/776
CPC classification number: G06V20/70, G06F40/40, G06V10/761, G06V10/764, G06V10/776
Abstract: Systems and methods for zero-shot prompt ensembling for zero-shot classification with text-image models can include utilizing a pre-trained text-image model to perform downstream tasks based on prompt-based weighting. The systems and methods may adjust for frequency-based bias and may automatically determine different prompt associations with a given downstream task. The systems and methods can aggregate weighted text embeddings and then determine a classification output based on similarity measures between an image embedding and the aggregated weighted text embeddings.
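The aggregation and similarity steps named in this abstract can be sketched as follows. The sketch assumes the per-prompt weights are already given and uses toy two-dimensional embeddings; how the weights are derived and corrected for frequency-based bias is not shown, and the function names and data are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def classify(image_emb, text_embs, prompt_weights):
    """text_embs[c][p] is the embedding of prompt p filled in with class c."""
    image_emb = normalize(image_emb)
    scores = []
    for class_embs in text_embs:
        dim = len(class_embs[0])
        agg = [0.0] * dim
        # Weighted sum of the normalized per-prompt text embeddings.
        for w, emb in zip(prompt_weights, class_embs):
            e = normalize(emb)
            for i in range(dim):
                agg[i] += w * e[i]
        agg = normalize(agg)
        # Cosine similarity between the image and the aggregated text embedding.
        scores.append(sum(a * b for a, b in zip(image_emb, agg)))
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy data: 2 classes, 2 prompts, 2-dimensional embeddings (all assumed).
text_embs = [
    [[1.0, 0.0], [0.9, 0.1]],  # class 0
    [[0.0, 1.0], [0.1, 0.9]],  # class 1
]
prompt_weights = [0.7, 0.3]
predicted, scores = classify([0.8, 0.2], text_embs, prompt_weights)
```

The class whose aggregated text embedding is most similar to the image embedding is returned as the prediction.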
-
Publication No.: US11907674B1
Publication date: 2024-02-20
Application No.: US18370683
Application date: 2023-09-20
Applicant: GOOGLE LLC
Inventor: Oscar Akerlund, Evgeny Sluzhaev, Golnaz Ghiasi, Thang Luong, Yifeng Lu, Igor Petrovski, Ágoston Weisz, Wei Yu, Rakesh Shivanna, Michael Andrew Goodman, Apoorv Kulshreshtha, Yu Du, Amin Ghafouri, Sanil Jain, Dustin Tran, Vikas Peswani, YaGuang Li
CPC classification number: G06F40/40
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based output, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using a LLM, LLM input (e.g., that includes at least the NL based input) to generate LLM output, and determine, based on the LLM output, textual content for inclusion in the multi-modal response and multimedia content for inclusion in the multi-modal response. In some implementations, the multimedia content can be obtained based on a multimedia content tag that is included in the LLM output and that is indicative of the multimedia content. In various implementations, the multimedia content can be interleaved between segments of the textual content.
-
Publication No.: US20230206030A1
Publication date: 2023-06-29
Application No.: US18008404
Application date: 2021-06-07
Applicant: Google LLC
Inventor: Rodolphe Jenatton, Florian Wenzel, Dustin Tran
IPC: G06N3/045, G06N3/0985
CPC classification number: G06N3/045, G06N3/0985
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an ensemble of neural networks. In particular, the neural networks in the ensemble are trained using different hyperparameters from one another.
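The idea of an ensemble whose members are trained with different hyperparameters can be illustrated with a minimal sketch. It uses a one-neuron logistic model as a stand-in for a neural network, and the learning rates, L2 strengths, and toy data are assumptions chosen for illustration; the abstract does not say which hyperparameters vary or how they are selected.

```python
import math


def train(data, lr, l2, epochs=200):
    """Fit a one-neuron logistic model by gradient descent (stand-in for a network)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        # Average gradient plus an L2 penalty on the weight.
        w -= lr * (gw / len(data) + l2 * w)
        b -= lr * gb / len(data)
    return w, b


def predict(model, x):
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

# One (learning rate, L2 strength) pair per ensemble member; values are assumed.
hyperparams = [(0.1, 0.0), (0.5, 0.01), (1.0, 0.1)]
ensemble = [train(data, lr, l2) for lr, l2 in hyperparams]


def ensemble_predict(x):
    """Average the members' predicted probabilities."""
    return sum(predict(m, x) for m in ensemble) / len(ensemble)
```

Averaging the members' outputs is one common way to combine an ensemble; the abstract does not state which combination rule is used.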
-