Training a model for performing abstractive text summarization

    Publication No.: US12164550B2

    Publication Date: 2024-12-10

    Application No.: US17651352

    Filing Date: 2022-02-16

    Applicant: Adobe Inc.

    Abstract: Techniques for training for and performing abstractive text summarization are disclosed. Such techniques include, in some embodiments, obtaining textual content, and generating a reconstruction of the textual content using a trained language model, the reconstructed textual content comprising an abstractive summary of the textual content generated based on relative importance parameters associated with respective portions of the textual content. In some cases, the trained language model includes a neural network language model that has been trained by identifying a plurality of discrete portions of training textual content, receiving the plurality of discrete portions of the training textual content as input to the language model, and predicting relative importance parameters associated with respective ones of the plurality of discrete portions of the training textual content, the relative importance parameters each being based at least on one or more linguistic similarity measures with respect to a ground truth.
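The abstract describes predicting a relative importance parameter for each discrete portion of the training text from a linguistic similarity measure against a ground truth. As a minimal sketch of that idea (not the patented method itself), the hypothetical function below scores each sentence by simple unigram overlap with a ground-truth summary, a stand-in for richer similarity measures such as ROUGE, and normalizes the scores into a distribution:

```python
# Hypothetical sketch: derive per-sentence relative importance parameters
# from a simple linguistic similarity measure against a ground-truth summary.
# The function name and the unigram-overlap measure are illustrative
# assumptions, not taken from the patent.

def importance_scores(sentences, ground_truth):
    """Return a normalized importance score for each sentence."""
    truth_tokens = set(ground_truth.lower().split())
    raw = []
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        # Fraction of the sentence's tokens that appear in the ground truth.
        raw.append(len(tokens & truth_tokens) / len(tokens) if tokens else 0.0)
    total = sum(raw)
    # Normalize so the scores form a distribution over the sentences.
    return [r / total if total else 0.0 for r in raw]

sentences = [
    "The cat sat on the mat.",
    "Stock prices rose sharply today.",
    "A cat was resting on a mat indoors.",
]
scores = importance_scores(sentences, "The cat sat on the mat.")
```

In a training setup like the one the abstract outlines, such scores could serve as prediction targets for the language model, so that generation can later favor the portions judged most important.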

    CURRICULAR NEXT CONVERSATION PREDICTION PRETRAINING FOR TRANSCRIPT SEGMENTATION

    Publication No.: US20240362413A1

    Publication Date: 2024-10-31

    Application No.: US18307300

    Filing Date: 2023-04-26

    Applicant: Adobe Inc.

    CPC classification number: G06F40/284 G06F40/30 G06N3/04 G06N3/09

    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and utilizing a transcript segmentation neural network to segment audio recording transcripts. In particular, in one or more embodiments, the disclosed systems implement a pretraining technique for a transcript segmentation neural network based on a specialized dataset that includes contextual information about stored sentences or conversations. For example, the disclosed systems train a transcript segmentation neural network based on contextual data that indicates semantic similarities and/or distances between sentences of a digital document. In some cases, the disclosed systems also (or alternatively) train a transcript segmentation neural network based on curricular data generated by a classification model. Moreover, in some cases, the disclosed systems use the trained transcript segmentation neural network to generate a segmented audio transcript.
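The abstract refers to contextual data indicating semantic similarities or distances between sentences, which a segmentation model can exploit to place boundaries. As a hedged illustration of that general idea (not the patented neural network), the sketch below uses a bag-of-words cosine similarity between adjacent sentences and starts a new segment wherever the similarity drops below a threshold; the function names and the threshold value are assumptions:

```python
# Hypothetical sketch: segment a transcript by semantic similarity between
# adjacent sentences, using a simple bag-of-words cosine as a stand-in for
# the learned representations a trained segmentation network would use.
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def segment_transcript(sentences, threshold=0.2):
    """Group sentences into segments; a similarity below the threshold
    between adjacent sentences opens a new segment."""
    segments = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        sim = cosine_similarity(Counter(prev.lower().split()),
                                Counter(cur.lower().split()))
        if sim < threshold:
            segments.append([cur])   # topic shift: start a new segment
        else:
            segments[-1].append(cur)
    return segments

transcript = [
    "We shipped the new login feature yesterday.",
    "The login feature uses token based sessions.",
    "Lunch today will be pizza in the kitchen.",
]
segments = segment_transcript(transcript)
```

Here the first two sentences share vocabulary and stay in one segment, while the off-topic third sentence opens a new one; a trained network would make the same kind of boundary decision from learned contextual features rather than raw token counts.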

    TRAINING A MODEL FOR PERFORMING ABSTRACTIVE TEXT SUMMARIZATION

    Publication No.: US20230259544A1

    Publication Date: 2023-08-17

    Application No.: US17651352

    Filing Date: 2022-02-16

    Applicant: Adobe Inc.

    Abstract: Techniques for training for and performing abstractive text summarization are disclosed. Such techniques include, in some embodiments, obtaining textual content, and generating a reconstruction of the textual content using a trained language model, the reconstructed textual content comprising an abstractive summary of the textual content generated based on relative importance parameters associated with respective portions of the textual content. In some cases, the trained language model includes a neural network language model that has been trained by identifying a plurality of discrete portions of training textual content, receiving the plurality of discrete portions of the training textual content as input to the language model, and predicting relative importance parameters associated with respective ones of the plurality of discrete portions of the training textual content, the relative importance parameters each being based at least on one or more linguistic similarity measures with respect to a ground truth.
