SYSTEMS AND METHODS FOR ENSEMBLING SOFT PROMPTS IN FEW-SHOT FINE-TUNING OF LANGUAGE MODELS

    Publication No.: US20240070394A1

    Publication Date: 2024-02-29

    Application No.: US18160967

    Filing Date: 2023-01-27

    CPC classification number: G06F40/284 G06F40/40

    Abstract: Embodiments described herein provide a mechanism that ensembles trainable soft prompts to transfer knowledge from source tasks in few-shot learning settings. Specifically, given a source task input from a source task training dataset, a set of soft prompts may be trained using a frozen pre-trained language model (PLM) on the large-scale source task training dataset. Each soft prompt in the set is then prepended to a target task input, and from each prompted input the frozen PLM generates a corresponding set of logits for predicting the classification of the target task input. An attention module generates input-logit attention scores, which are used as weights to compute a weighted linear combination of the logits. This weighted linear combination serves as the final logits for predicting the classification of the target task input.
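
The final ensembling step described in the abstract can be illustrated with a minimal sketch. It assumes the per-prompt logits and the raw input-logit attention scores are already available; the abstract does not say how the attention module computes or normalizes its scores, so the softmax normalization and the names `softmax` and `ensemble_logits` are assumptions made purely for illustration, not the claimed implementation.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1 (assumed normalization)."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_logits(attention_scores, per_prompt_logits):
    """Combine per-prompt logits into final logits.

    attention_scores: one raw score per soft prompt (hypothetical output
        of the attention module for a given target task input).
    per_prompt_logits: one list of class logits per soft prompt, each
        produced by the frozen PLM with that soft prompt prepended.
    Returns (weights, final_logits, predicted_class).
    """
    if len(attention_scores) != len(per_prompt_logits):
        raise ValueError("need exactly one attention score per soft prompt")
    weights = softmax(attention_scores)
    num_classes = len(per_prompt_logits[0])
    # Weighted linear combination of the logits, computed class by class.
    final_logits = [
        sum(w * logits[c] for w, logits in zip(weights, per_prompt_logits))
        for c in range(num_classes)
    ]
    predicted_class = max(range(num_classes), key=lambda c: final_logits[c])
    return weights, final_logits, predicted_class


# Two soft prompts that disagree on a two-class target input.
weights, final_logits, predicted = ensemble_logits(
    attention_scores=[10.0, 0.0],
    per_prompt_logits=[[2.0, 0.0], [0.0, 2.0]],
)
```

In this toy call the first soft prompt receives nearly all of the attention weight, so the final prediction follows its logits. With equal scores, both prompts would contribute equally to the combination.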
