-
Publication No.: US20240428015A1
Publication date: 2024-12-26
Application No.: US18386343
Filing date: 2023-11-02
Applicant: Google LLC
Inventor: Jinsung Yoon, Jiefeng Chen, Sayna Ebrahimi, Sercan Omer Arik
IPC: G06F40/40
Abstract: Aspects of the disclosure are directed to methods, systems, and computer-readable media for adaptation with self-evaluation to improve selective prediction in large language models (LLMs), generally referred to as ASPIRE. ASPIRE includes training LLMs on a portion of training data from a question answering task to learn self-evaluation, e.g., to learn to distinguish whether a generated answer is correct. ASPIRE further includes a selection score that combines the likelihood that the generated answer is correct with a self-evaluation score for selective prediction. ASPIRE demonstrates improved selective prediction performance at lower computational cost.
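Illustrative sketch (not part of the filing): the abstract says the selection score combines the answer likelihood with a self-evaluation score but does not give the formula. The Python below assumes one plausible form, a weighted sum in log space, and abstains when the score falls below a threshold. The names `selection_score`, `selective_predict`, `alpha`, and `threshold` are hypothetical, as is the choice of combination rule.

```python
import math
from typing import Optional


def selection_score(answer_log_likelihood: float,
                    self_eval_prob: float,
                    alpha: float = 0.25) -> float:
    """Combine two confidence signals into one score.

    ASSUMPTION: a convex combination in log space. The abstract only
    says the two quantities are combined, not how.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if not 0.0 < self_eval_prob <= 1.0:
        raise ValueError("self_eval_prob must be in (0, 1]")
    return ((1.0 - alpha) * answer_log_likelihood
            + alpha * math.log(self_eval_prob))


def selective_predict(answer: str,
                      answer_log_likelihood: float,
                      self_eval_prob: float,
                      threshold: float,
                      alpha: float = 0.25) -> Optional[str]:
    """Return the answer if the score clears the threshold, else abstain."""
    score = selection_score(answer_log_likelihood, self_eval_prob, alpha)
    return answer if score >= threshold else None
```

In this sketch, `alpha` trades off the two signals: `alpha = 0` uses the likelihood alone and `alpha = 1` uses the self-evaluation score alone. Returning `None` stands for abstention.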
-
Publication No.: US20240249204A1
Publication date: 2024-07-25
Application No.: US18419476
Filing date: 2024-01-22
Applicant: Google LLC
Inventor: Jinsung Yoon, Jiefeng Chen, Sayna Ebrahimi, Sercan Omer Arik
IPC: G06N20/20
CPC classification number: G06N20/20
Abstract: A method includes obtaining a set of unlabeled test data samples and, for each respective initial training step, determining a first average output for each unlabeled test data sample using a deep ensemble model. For each round of a plurality of rounds, the method includes selecting a subset of unlabeled test data samples based on the determined first average outputs, labeling each respective unlabeled test data sample in the subset of unlabeled test data samples, fine-tuning the deep ensemble model using the subset of labeled test data samples, and determining a second average output for each unlabeled test data sample using the fine-tuned deep ensemble model. The method also includes generating, using the set of unlabeled test data samples and the determined second average outputs, a pseudo-labeled set of training data samples. The method also includes training the deep ensemble model using the pseudo-labeled set of training data samples.
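Illustrative sketch (not part of the filing): the rounds described in the abstract can be outlined in Python as follows. The abstract does not state the selection criterion or the pseudo-labeling rule, so this sketch assumes the least-confident samples are sent for labeling and that only predictions above a confidence threshold become pseudo-labels. The callables `predict`, `oracle`, and `fine_tune`, and the parameters `budget` and `threshold`, are hypothetical stand-ins for the ensemble, the labeler, and the training routine.

```python
from typing import Callable, List, Sequence, Tuple

Probs = List[float]  # class probabilities for one sample


def average_outputs(member_outputs: Sequence[Sequence[Probs]]) -> List[Probs]:
    """Average class probabilities across ensemble members.

    member_outputs[m][i] is member m's probability vector for sample i.
    """
    n_members = len(member_outputs)
    return [[sum(col) / n_members for col in zip(*per_sample)]
            for per_sample in zip(*member_outputs)]


def select_for_labeling(avg: Sequence[Probs], candidates: Sequence[int],
                        budget: int) -> List[int]:
    """ASSUMPTION: pick the `budget` candidates with the lowest confidence."""
    return sorted(candidates, key=lambda i: max(avg[i]))[:budget]


def pseudo_label(avg: Sequence[Probs], threshold: float) -> List[Tuple[int, int]]:
    """ASSUMPTION: keep (index, argmax class) where confidence >= threshold."""
    return [(i, p.index(max(p))) for i, p in enumerate(avg)
            if max(p) >= threshold]


def active_selective_prediction(samples: Sequence, predict: Callable,
                                oracle: Callable, fine_tune: Callable,
                                rounds: int, budget: int, threshold: float):
    avg = average_outputs(predict(samples))       # first average outputs
    unlabeled = list(range(len(samples)))
    for _ in range(rounds):
        chosen = select_for_labeling(avg, unlabeled, budget)
        # Label the chosen samples and fine-tune the ensemble on them.
        fine_tune([(samples[i], oracle(samples[i])) for i in chosen])
        unlabeled = [i for i in unlabeled if i not in chosen]
        avg = average_outputs(predict(samples))   # second average outputs
    # Build the pseudo-labeled set, then train on it.
    pseudo = [(samples[i], y) for i, y in pseudo_label(avg, threshold)]
    fine_tune(pseudo)
    return pseudo
```

Here `predict(samples)` is expected to return one list of probability vectors per ensemble member, and `fine_tune` is expected to update the ensemble in place. For simplicity, the sketch pseudo-labels every confident sample, including those already labeled by `oracle`.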
-