MODEL TRAINING METHOD AND RELATED DEVICE

    Publication No.: US20240428070A1

    Publication Date: 2024-12-26

    Application No.: US18809757

    Application Date: 2024-08-20

    Abstract: A model training method is disclosed. The method includes obtaining a second embedding vector that is input to a decoder in a pre-trained language model (PLM), where the second embedding vector corresponds to a second data sequence. The second data sequence includes first sub-data, a masked to-be-predicted data unit, and second sub-data; the first sub-data is located before the masked to-be-predicted data unit in the second data sequence, and the second sub-data is located after it. The method further includes obtaining a hidden state based on a first embedding vector by using an encoder in the PLM, and predicting the masked to-be-predicted data unit based on the first sub-data, the second sub-data, and the hidden state by using the decoder in the PLM and an output layer of the decoder.
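
The layout of the second data sequence described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `[MASK]` symbol, the names `MaskedSequence`, `mask_unit`, and `visible_positions`, and the token-level granularity are all assumptions introduced here for illustration. The sketch only shows how one data unit is masked, how the remaining units split into first and second sub-data, and how a decoder that uses context on both sides differs from a left-to-right (causal) one.

```python
from dataclasses import dataclass
from typing import List

MASK = "[MASK]"  # hypothetical mask symbol; the abstract does not name one


@dataclass
class MaskedSequence:
    first_sub_data: List[str]   # data units located before the masked unit
    second_sub_data: List[str]  # data units located after the masked unit
    decoder_input: List[str]    # second data sequence with the unit masked
    target: str                 # the to-be-predicted data unit


def mask_unit(sequence: List[str], index: int) -> MaskedSequence:
    """Mask one data unit and split the rest into left and right context."""
    if not 0 <= index < len(sequence):
        raise IndexError("index out of range")
    first = sequence[:index]
    second = sequence[index + 1:]
    return MaskedSequence(first, second, first + [MASK] + second, sequence[index])


def visible_positions(length: int, causal: bool) -> List[List[bool]]:
    """visible[i][j] is True when position i may use position j as context.

    causal=True models a left-to-right decoder; causal=False models a decoder
    that also sees the data after the masked unit (assumed here to match the
    abstract's use of both first and second sub-data).
    """
    return [[(j <= i) if causal else True for j in range(length)]
            for i in range(length)]


example = mask_unit(["the", "cat", "sat", "down"], 2)
```

In this sketch, `example.decoder_input` holds the masked sequence that would be embedded into the second embedding vector, and `example.target` is the unit the output layer would be trained to predict. The encoder side (the first embedding vector and the hidden state) is deliberately left out, since the abstract gives no detail about its input.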
