-
Publication No.: US12164550B2
Publication Date: 2024-12-10
Application No.: US17651352
Filing Date: 2022-02-16
Applicant: Adobe Inc.
Inventor: Sajad Sotudeh Gharebagh, Hanieh Deilamsalehy, Franck Dernoncourt
Abstract: Techniques for training for and performing abstractive text summarization are disclosed. Such techniques include, in some embodiments, obtaining textual content, and generating a reconstruction of the textual content using a trained language model, the reconstructed textual content comprising an abstractive summary of the textual content generated based on relative importance parameters associated with respective portions of the textual content. In some cases, the trained language model includes a neural network language model that has been trained by identifying a plurality of discrete portions of training textual content, receiving the plurality of discrete portions of the training textual content as input to the language model, and predicting relative importance parameters associated with respective ones of the plurality of discrete portions of the training textual content, the relative importance parameters each being based at least on one or more linguistic similarity measures with respect to a ground truth.
-
Publication No.: US20250022459A1
Publication Date: 2025-01-16
Application No.: US18220910
Filing Date: 2023-07-12
Applicant: Adobe Inc.
Inventor: Viet Dac Lai, Trung Bui, Seunghyun Yoon, Quan Tran, Hao Tan, Hanieh Deilamsalehy, Abel Salinas, Franck Dernoncourt
IPC: G10L15/183, G10L15/065
Abstract: The disclosed method generates helpful training data for a language model, for example, a model implementing a punctuation restoration task, for real-world ASR texts. The method uses a reinforcement learning method using a generative AI model to generate additional data to train the language model. The method allows the generative AI model to learn from real-world ASR text to generate more effective training examples based on gradient feedback from the language model.
-
Publication No.: US20240362413A1
Publication Date: 2024-10-31
Application No.: US18307300
Filing Date: 2023-04-26
Applicant: Adobe Inc.
Inventor: Anvesh Rao Vijjini, Hanieh Deilamsalehy, Franck Dernoncourt
IPC: G06F40/284, G06F40/30, G06N3/04, G06N3/09
CPC classification number: G06F40/284, G06F40/30, G06N3/04, G06N3/09
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and utilizing a transcript segmentation neural network to segment audio recording transcripts. In particular, in one or more embodiments, the disclosed systems implement a pretraining technique for a transcript segmentation neural network based on a specialized dataset that includes contextual information about stored sentences or conversations. For example, the disclosed systems train a transcript segmentation neural network based on contextual data that indicates semantic similarities and/or distances between sentences of a digital document. In some cases, the disclosed systems also (or alternatively) train a transcript segmentation neural network based on curricular data generated by a classification model. Moreover, in some cases, the disclosed systems use the trained transcript segmentation neural network to generate a segmented audio transcript.
-
Publication No.: US20230259708A1
Publication Date: 2023-08-17
Application No.: US17650876
Filing Date: 2022-02-14
Applicant: ADOBE INC.
Inventor: Amir Pouran Ben Veyseh, Franck Dernoncourt, Walter W. Chang, Trung Huu Bui, Hanieh Deilamsalehy, Seunghyun Yoon, Rajiv Bhawanji Jain, Quan Hung Tran, Varun Manjunatha
IPC: G06F40/289, G06F40/30, G10L15/22, G10L15/06, G10L15/16
CPC classification number: G06F40/289, G06F40/30, G10L15/22, G10L15/063, G10L15/16, G10L2015/0635
Abstract: Systems and methods for key-phrase extraction are described. The systems and methods include receiving a transcript including a text paragraph and generating key-phrase data for the text paragraph using a key-phrase extraction network. The key-phrase extraction network is trained to identify domain-relevant key-phrase data based on domain data obtained using a domain discriminator network. The systems and methods further include generating meta-data for the transcript based on the key-phrase data.
-
Publication No.: US20230259544A1
Publication Date: 2023-08-17
Application No.: US17651352
Filing Date: 2022-02-16
Applicant: Adobe Inc.
Inventor: Sajad Sotudeh Gharebagh, Hanieh Deilamsalehy, Franck Dernoncourt
CPC classification number: G06F16/345, G06K9/6215, G06K9/6262, G06N3/08, G06N3/04, G06F16/3334
Abstract: Techniques for training for and performing abstractive text summarization are disclosed. Such techniques include, in some embodiments, obtaining textual content, and generating a reconstruction of the textual content using a trained language model, the reconstructed textual content comprising an abstractive summary of the textual content generated based on relative importance parameters associated with respective portions of the textual content. In some cases, the trained language model includes a neural network language model that has been trained by identifying a plurality of discrete portions of training textual content, receiving the plurality of discrete portions of the training textual content as input to the language model, and predicting relative importance parameters associated with respective ones of the plurality of discrete portions of the training textual content, the relative importance parameters each being based at least on one or more linguistic similarity measures with respect to a ground truth.