-
Publication No.: US11741142B2
Publication Date: 2023-08-29
Application No.: US17589522
Filing Date: 2022-01-31
Applicant: salesforce.com, inc.
Inventor: Haopeng Zheng, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
IPC: G06F16/34, G06F40/166, G06N20/00, G06F40/117, G06F40/279
CPC classification number: G06F16/345, G06F40/166, G06N20/00, G06F40/117, G06F40/279
Abstract: Embodiments described herein provide document summarization systems and methods that utilize fine-tuning of pre-trained abstractive summarization models to produce summaries that more faithfully track the content of the documents. Such abstractive summarization models may be pre-trained using a corpus consisting of pairs of articles and associated summaries. For each article-summary pair, a pseudo label or control code is generated and represents a faithfulness of the summary with respect to the article. The pre-trained model is then fine-tuned based on the article-summary pairs and the corresponding control codes. The resulting fine-tuned models then provide improved faithfulness in document summarization tasks.
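The data-preparation step described in the abstract (one control code per article-summary pair, then fine-tuning on the pairs together with their codes) can be illustrated with a minimal sketch. The abstract does not say how faithfulness is scored or how the control code is fed to the model, so everything concrete here is an assumption made for illustration: the token-overlap score, the 0.8 threshold, the `<faithful>` / `<unfaithful>` code strings, and the choice to prepend the code to the article text.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical control-code strings; the abstract does not name them.
FAITHFUL = "<faithful>"
UNFAITHFUL = "<unfaithful>"


@dataclass
class Example:
    source: str  # model input: control code followed by the article
    target: str  # reference summary


def tokens(text: str) -> List[str]:
    """Lowercase alphanumeric tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(article: str, summary: str) -> float:
    """Assumed faithfulness proxy: share of summary tokens found in the article."""
    summary_tokens = tokens(summary)
    if not summary_tokens:
        return 0.0
    article_vocab = set(tokens(article))
    supported = sum(1 for t in summary_tokens if t in article_vocab)
    return supported / len(summary_tokens)


def control_code(article: str, summary: str, threshold: float = 0.8) -> str:
    """Map the proxy score to a pseudo label (threshold is an assumption)."""
    if overlap_score(article, summary) >= threshold:
        return FAITHFUL
    return UNFAITHFUL


def build_examples(pairs: List[Tuple[str, str]], threshold: float = 0.8) -> List[Example]:
    """Attach a control code to each (article, summary) pair for fine-tuning."""
    return [
        Example(source=f"{control_code(a, s, threshold)} {a}", target=s)
        for a, s in pairs
    ]
```

The resulting `Example` objects would then be passed to whatever sequence-to-sequence fine-tuning routine is in use. One plausible way to use such a model afterwards is to prepend the `<faithful>` code to a new article at inference time, although the abstract itself does not describe that step.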
-
Publication No.: US20230054068A1
Publication Date: 2023-02-23
Application No.: US17589522
Filing Date: 2022-01-31
Applicant: salesforce.com, inc.
Inventor: Haopeng Zheng, Semih Yavuz, Wojciech Kryscinski, Kazuma Hashimoto, Yingbo Zhou
IPC: G06F40/166, G06F40/279, G06F40/117, G06N20/00
Abstract: Embodiments described herein provide document summarization systems and methods that utilize fine-tuning of pre-trained abstractive summarization models to produce summaries that more faithfully track the content of the documents. Such abstractive summarization models may be pre-trained using a corpus consisting of pairs of articles and associated summaries. For each article-summary pair, a pseudo label or control code is generated and represents a faithfulness of the summary with respect to the article. The pre-trained model is then fine-tuned based on the article-summary pairs and the corresponding control codes. The resulting fine-tuned models then provide improved faithfulness in document summarization tasks.
-