Publication No.: US11755909B2
Publication Date: 2023-09-12
Application No.: US17805758
Filing Date: 2022-06-07
IPC Classification: G06N3/08, G06F40/20, G06F40/166, G06F40/216, G06F16/34, G06F40/30, G06F30/27, G06N5/025, G06F40/103, G06F17/00
CPC Classification: G06N3/08, G06F40/166, G06F40/20, G06F16/345, G06F30/27, G06F40/103, G06F40/216, G06F40/30, G06N5/025
Abstract: There is provided a method and a system for training an extractive machine learning algorithm (MLA) to generate extractive summaries of text documents. Reference documents and associated extractive summaries are received. The extractive MLA is then trained to generate an extractive summary, where the training includes, for a given reference document: encoding, using a sentence encoder, a plurality of reference sentences to obtain an associated plurality of sentence representations; encoding, using a document encoder, the associated plurality of sentence representations to obtain a document representation; and extracting, using a decoder and based on the associated plurality of sentence representations and the document representation, a first reference sentence of the plurality of reference sentences to obtain a first extracted sentence. A given parameter is updated based on the first extracted sentence and the given reference document summary. A trained extractive MLA comprising the updated given parameter is output.
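
Illustrative sketch (not part of the patent record): the training loop described in the abstract can be pictured with a toy example in plain Python. Everything below is an assumption made for illustration, not the claimed implementation: the hashed bag-of-words sentence encoder, the mean-pooling document encoder, the linear softmax decoder, the cross-entropy update, and all names (`encode_sentence`, `encode_document`, `train_step`, `extract_first_sentence`, `DIM`) are hypothetical. The record does not specify any of these choices.

```python
import math

DIM = 8  # hypothetical representation size, chosen only for this sketch


def encode_sentence(sentence, dim=DIM):
    """Toy sentence encoder (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in sentence.lower().split():
        vec[sum(ord(c) for c in token) % dim] += 1.0  # deterministic hash
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def encode_document(sentence_reprs):
    """Toy document encoder (assumption): mean of the sentence representations."""
    n = len(sentence_reprs)
    return [sum(col) / n for col in zip(*sentence_reprs)]


def decoder_probs(weights, sentences):
    """Toy decoder (assumption): linear scores over sentence and document
    representations, turned into extraction probabilities with a softmax."""
    reprs = [encode_sentence(s) for s in sentences]
    doc = encode_document(reprs)
    # Each feature vector combines a sentence with the document representation.
    feats = [s + [a * b for a, b in zip(s, doc)] for s in reprs]
    scores = [sum(w * f for w, f in zip(weights, feat)) for feat in feats]
    top = max(scores)
    exps = [math.exp(sc - top) for sc in scores]
    total = sum(exps)
    return [e / total for e in exps], feats


def extract_first_sentence(weights, sentences):
    """Return the index of the sentence the decoder would extract first."""
    probs, _ = decoder_probs(weights, sentences)
    return probs.index(max(probs))


def train_step(weights, sentences, target_index, lr=0.5):
    """One parameter update from the extracted sentence vs. the reference summary."""
    probs, feats = decoder_probs(weights, sentences)
    loss = -math.log(probs[target_index])  # cross-entropy against the reference
    grad = [0.0] * len(weights)
    for i, feat in enumerate(feats):
        err = probs[i] - (1.0 if i == target_index else 0.0)
        for j, f in enumerate(feat):
            grad[j] += err * f
    return [w - lr * g for w, g in zip(weights, grad)], loss


# Made-up reference document; its reference extractive summary is sentence 2.
sentences = ["cats sleep", "dogs bark", "birds sing"]
target_index = 2
weights = [0.0] * (2 * DIM)  # the "given parameter" being updated
losses = []
for _ in range(20):
    weights, loss = train_step(weights, sentences, target_index)
    losses.append(loss)
```

After training, `extract_first_sentence(weights, sentences)` gives the index the toy decoder selects, and `losses` records how the training loss evolved. A system along the lines of the abstract would presumably use learned neural encoders and extract several sentences in sequence; the sketch covers only the first extracted sentence and a single reference document.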