FINE-TUNING MULTI-HEAD NETWORK FROM A SINGLE TRANSFORMER LAYER OF PRE-TRAINED LANGUAGE MODEL

    公开(公告)号:US20230115321A1

    公开(公告)日:2023-04-13

    申请号:US17735651

    申请日:2022-05-03

    Abstract: Techniques are provided for customizing or fine-tuning a pre-trained version of a machine-learning model that includes multiple layers and is configured to process audio or textual language input. Each of the multiple layers is configured with a plurality of layer-specific pre-trained parameter values corresponding to a plurality of parameters, and each of the multiple layers is configured to implement multi-head attention. An incomplete subset of the multiple layers is identified for which corresponding layer-specific pre-trained parameter values are to be fine-tuned using a client data set. The machine-learning model is fine-tuned using the client data set to generate an updated version of the machine-learning model, where the layer-specific pre-trained parameter values configured for each layer of one of more of the multiple layers not included in the incomplete subset are frozen during the fine-tuning. Use of the updated version of the machine-learning model is facilitated.

    Techniques for out-of-domain (OOD) detection

    公开(公告)号:US12299402B2

    公开(公告)日:2025-05-13

    申请号:US18659606

    申请日:2024-05-09

    Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances. One particular technique includes receiving an utterance and a target domain of a chatbot, generating a sentence embedding for the utterance, obtaining an embedding representation for each cluster of in-domain utterances associated with the target domain, predicting, using a metric learning model, a first probability that the utterance belongs to the target domain based on a similarity or difference between the sentence embedding and each embedding representation for each cluster, predicting, using an outlier detection model, a second probability that the utterance belongs to the target domain based on a determined distance or density deviation between the sentence embedding and embedding representations for neighboring clusters, evaluating the first probability and the second probability to determine a final probability, and classifying the utterance as in-domain or out-of-domain for the chatbot based on the final probability.

Patent Agency Ranking