CURRICULAR NEXT CONVERSATION PREDICTION PRETRAINING FOR TRANSCRIPT SEGMENTATION

    Publication number: US20240362413A1

    Publication date: 2024-10-31

    Application number: US18307300

    Filing date: 2023-04-26

    Applicant: Adobe Inc.

    CPC classification number: G06F40/284 G06F40/30 G06N3/04 G06N3/09

    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and utilizing a transcript segmentation neural network to segment audio recording transcripts. In particular, in one or more embodiments, the disclosed systems implement a pretraining technique for a transcript segmentation neural network based on a specialized dataset that includes contextual information about stored sentences or conversations. For example, the disclosed systems train a transcript segmentation neural network based on contextual data that indicates semantic similarities and/or distances between sentences of a digital document. In some cases, the disclosed systems also (or alternatively) train a transcript segmentation neural network based on curricular data generated by a classification model. Moreover, in some cases, the disclosed systems use the trained transcript segmentation neural network to generate a segmented audio transcript.
