METHOD AND SYSTEM FOR LONG-FORM ANSWER EXTRACTION BASED ON COMBINATION OF SENTENCE INDEX GENERATION TECHNIQUES

    Publication No.: US20240126791A1

    Publication date: 2024-04-18

    Application No.: US18470657

    Filing date: 2023-09-20

    IPC classification: G06F16/31 G06F16/332

    CPC classification: G06F16/31 G06F16/3329

    Abstract: This disclosure relates generally to long-form answer extraction and, more particularly, to long-form answer extraction based on a combination of sentence index generation techniques. Existing answer extraction techniques have achieved significant progress for extractive short answers; however, less progress has been made for long-form questions that require explanations. Further, state-of-the-art long-answer extraction techniques either produce poorer long-form answers or do not address sparsity, which becomes an issue in longer contexts. Additionally, pre-trained generative sequence-to-sequence models are gaining popularity for factoid answer extraction tasks. Hence, the disclosure proposes long-form answer extraction based on several steps, including training a set of generative sequence-to-sequence models comprising a sentence indices generation model and a sentence index spans generation model. The trained set of generative sequence-to-sequence models is further utilized for long-form answer extraction based on a union of several sentence index generation techniques comprising sentence indices and sentence index spans.
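
The union step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the output formats are assumptions (an indices model that emits space-separated sentence indices such as "1 4", and a spans model that emits inclusive "start-end" pairs such as "2-3"), and the function names `parse_indices`, `parse_spans`, and `extract_long_answer` are hypothetical. The two trained models themselves are not shown; their decoded text outputs are passed in as plain strings.

```python
import re
from typing import List, Set


def parse_indices(text: str) -> Set[int]:
    """Parse an assumed 'i j k' output of the sentence indices model."""
    return {int(tok) for tok in re.findall(r"\d+", text)}


def parse_spans(text: str) -> Set[int]:
    """Expand an assumed 'start-end' output of the spans model (inclusive)."""
    selected: Set[int] = set()
    for start, end in re.findall(r"(\d+)\s*-\s*(\d+)", text):
        lo, hi = sorted((int(start), int(end)))
        selected.update(range(lo, hi + 1))
    return selected


def extract_long_answer(sentences: List[str],
                        indices_output: str,
                        spans_output: str) -> str:
    """Join the sentences picked by the union of both index sets."""
    union = parse_indices(indices_output) | parse_spans(spans_output)
    # Discard indices that fall outside the context (0-based, an assumption),
    # then restore document order before joining.
    valid = sorted(i for i in union if 0 <= i < len(sentences))
    return " ".join(sentences[i] for i in valid)
```

Taking the union rather than the intersection means a sentence is kept if either technique selects it, which favours recall for answers scattered across a long context; out-of-range indices produced by a generative model are simply dropped.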