Training text summarization neural networks with an extracted segments prediction objective
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text summarization neural network. One of the methods includes pre-training the text summarization neural network including learning values of a plurality of network parameters through self-supervised learning using unlabeled data comprising unlabeled first texts, the pre-training including: obtaining an unlabeled first text comprising a plurality of segments; selecting one or more of the plurality of segments; processing a masked first text that excludes the one or more selected segments to generate a prediction of the one or more selected segments; and determining, based on a difference between the prediction and the one or more selected segments, an update to the current values of the plurality of network parameters; adapting the pre-trained text summarization neural network for a specific text summarization task using labeled data comprising second texts and respective summaries of the second texts.
Information query
Patent Agency Ranking
0/0