METHOD AND APPARATUS FOR PRE-TRAINING SEMANTIC REPRESENTATION MODEL AND ELECTRONIC DEVICE
Abstract:
A method for pre-training a semantic representation model includes: for each video-text pair in pre-training data, determining a mask image sequence, a mask character sequence, and a mask image-character sequence of the video-text pair; determining a plurality of feature sequences, and mask position prediction results respectively corresponding to the plurality of feature sequences, by inputting the mask image sequence, the mask character sequence, and the mask image-character sequence into an initial semantic representation model; and building a loss function based on the plurality of feature sequences, the mask position prediction results respectively corresponding to the plurality of feature sequences, and true mask position results, and adjusting coefficients of the initial semantic representation model according to the loss function to train the model.
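The masking and loss steps described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `[MASK]` placeholder, the 15% mask ratio, the random choice of positions, the function names, and the use of a mean negative log-likelihood are all assumptions made for illustration. The sketch covers only the mask position prediction term of the loss; how the feature sequences enter the loss is not specified in the abstract and is left out.

```python
import math
import random

MASK = "[MASK]"  # hypothetical placeholder token; the abstract does not name one


def mask_sequence(seq, ratio, rng):
    """Mask a random subset of positions.

    Returns the masked copy and a dict {position: original item}, which plays
    the role of the "true mask position results".
    """
    n_mask = min(len(seq), max(1, round(len(seq) * ratio)))
    positions = sorted(rng.sample(range(len(seq)), n_mask))
    masked = list(seq)
    truth = {}
    for p in positions:
        truth[p] = masked[p]
        masked[p] = MASK
    return masked, truth


def build_masked_inputs(frames, chars, ratio=0.15, seed=0):
    """Build the three masked sequences for one video-text pair (assumed scheme)."""
    rng = random.Random(seed)
    joint = list(frames) + list(chars)  # image-character sequence: frames, then characters
    return {
        "image": mask_sequence(frames, ratio, rng),
        "character": mask_sequence(chars, ratio, rng),
        "image_character": mask_sequence(joint, ratio, rng),
    }


def masked_position_loss(predictions, truth):
    """Mean negative log-likelihood of the true item at each masked position.

    predictions: {position: {candidate item: probability}}, a stand-in for the
    model's mask position prediction results.
    """
    eps = 1e-12  # avoids log(0) when the true item received zero probability
    losses = [-math.log(predictions[p].get(t, 0.0) + eps) for p, t in truth.items()]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    inputs = build_masked_inputs(["f0", "f1", "f2", "f3"], list("hello"))
    for name, (masked, truth) in inputs.items():
        print(name, masked, truth)
```

In a real training loop, the three masked sequences would be fed to the model, one loss term would be computed per sequence, and the combined loss would drive the coefficient update; here a hand-written probability table stands in for the model output.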