Publication No.: US20240362413A1
Publication Date: 2024-10-31
Application No.: US18307300
Filing Date: 2023-04-26
Applicant: Adobe Inc.
Inventor: Anvesh Rao Vijjini, Hanieh Deilamsalehy, Franck Dernoncourt
IPC: G06F40/284, G06F40/30, G06N3/04, G06N3/09
CPC classification number: G06F40/284, G06F40/30, G06N3/04, G06N3/09
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and utilizing a transcript segmentation neural network to segment audio recording transcripts. In particular, in one or more embodiments, the disclosed systems implement a pretraining technique for a transcript segmentation neural network based on a specialized dataset that includes contextual information about stored sentences or conversations. For example, the disclosed systems train a transcript segmentation neural network based on contextual data that indicates semantic similarities and/or distances between sentences of a digital document. In some cases, the disclosed systems also (or alternatively) train a transcript segmentation neural network based on curricular data generated by a classification model. Moreover, in some cases, the disclosed systems use the trained transcript segmentation neural network to generate a segmented audio transcript.
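
The abstract does not say how the contextual data is built. As an illustrative sketch only, the following Python assumes that the positional distance between two sentences of a document can serve as a simple stand-in for the "distances between sentences" signal. The function name `build_distance_pairs`, the `max_distance` parameter, and the sample sentences are hypothetical and do not come from the patent.

```python
from itertools import combinations


def build_distance_pairs(sentences, max_distance=3):
    """Return (sentence_a, sentence_b, distance) triples for nearby sentences.

    Assumption: distance is the number of sentence positions separating the
    pair; the patent may define contextual distance differently.
    """
    pairs = []
    for i, j in combinations(range(len(sentences)), 2):
        distance = j - i  # always positive because combinations yields i < j
        if distance <= max_distance:
            pairs.append((sentences[i], sentences[j], distance))
    return pairs


# Hypothetical four-sentence document used only for illustration.
doc = [
    "Welcome to the quarterly review.",
    "Revenue grew in the last quarter.",
    "Costs stayed roughly flat.",
    "Next, let us discuss hiring plans.",
]
pairs = build_distance_pairs(doc, max_distance=2)
```

Pairs like these could, under the same assumption, act as weakly labeled examples for pretraining a model to judge whether two sentences belong to the same segment.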