Invention Grant
- Patent Title: Speaker thumbnail selection and speaker visualization in diarized transcripts for text-based video
- Application No.: US17967697
- Application Date: 2022-10-17
- Publication No.: US12300272B2
- Publication Date: 2025-05-13
- Inventor: Lubomira Assenova Dontcheva, Xue Bai, Aseem Omprakash Agarwala, Joel Richard Brandt
- Applicant: Adobe Inc.
- Applicant Address: US CA San Jose
- Assignee: Adobe Inc.
- Current Assignee: Adobe Inc.
- Current Assignee Address: US CA San Jose
- Agency: Shook, Hardy & Bacon L.L.P.
- Main IPC: G11B27/02
- IPC: G11B27/02; G06V20/40; G06V40/16

Abstract:
Embodiments of the present invention provide systems, methods, and computer storage media for selection of the best image of a particular speaker's face in a video, and visualization in a diarized transcript. In an example embodiment, candidate images of a face of a detected speaker are extracted from frames of a video identified by a detected face track for the face, and a representative image of the detected speaker's face is selected from the candidate images based on image quality, facial emotion (e.g., using an emotion classifier that generates a happiness score), a size factor (e.g., favoring larger images), and/or penalizing images that appear towards the beginning or end of a face track. As such, each segment of the transcript is presented with the representative image of the speaker who spoke that segment and/or input is accepted changing the representative image associated with each speaker.
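The selection criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not state how the factors are combined, so everything specific here is an assumption made for illustration: the `Candidate` fields, the weighted sum, the normalization of size by the largest candidate, the linear ramp used as the edge penalty, and the `margin` and `weights` defaults. It is not the patented method.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Candidate:
    """One candidate face crop taken from a face track (hypothetical structure)."""
    frame_index: int   # 0-based position of the frame within the face track
    quality: float     # image quality score in [0, 1], from an assumed upstream model
    happiness: float   # happiness score in [0, 1], from an assumed emotion classifier
    area: float        # area of the face crop in pixels


def edge_penalty(frame_index: int, track_length: int, margin: float = 0.1) -> float:
    """Return 1.0 mid-track and ramp linearly down to 0.0 at either end.

    `margin` is the assumed fraction of the track at each end that is penalized.
    """
    if track_length <= 1 or margin <= 0:
        return 1.0
    pos = frame_index / (track_length - 1)   # 0.0 at track start, 1.0 at track end
    dist = min(pos, 1.0 - pos)               # distance to the nearest end
    return min(1.0, dist / margin)


def score(c: Candidate, track_length: int, max_area: float,
          weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Combine quality, happiness, and relative size, then apply the edge penalty."""
    w_quality, w_happiness, w_size = weights
    size = c.area / max_area if max_area > 0 else 0.0   # favors larger images
    base = w_quality * c.quality + w_happiness * c.happiness + w_size * size
    return base * edge_penalty(c.frame_index, track_length)


def select_representative(candidates: Sequence[Candidate], track_length: int) -> Candidate:
    """Pick the highest-scoring candidate image for one detected speaker."""
    if not candidates:
        raise ValueError("no candidate images for this face track")
    max_area = max(c.area for c in candidates)
    return max(candidates, key=lambda c: score(c, track_length, max_area))


candidates = [
    Candidate(frame_index=0, quality=0.9, happiness=0.9, area=10000),   # at track start
    Candidate(frame_index=50, quality=0.7, happiness=0.6, area=8000),   # mid-track
    Candidate(frame_index=95, quality=0.8, happiness=0.8, area=9000),   # near track end
]
best = select_representative(candidates, track_length=101)
print(best.frame_index)
```

In this example the first candidate has the highest raw scores but sits on the first frame of the track, so the penalty removes it from contention and the mid-track image is chosen.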
Public/Granted literature
- US20240127855A1 SPEAKER THUMBNAIL SELECTION AND SPEAKER VISUALIZATION IN DIARIZED TRANSCRIPTS FOR TEXT-BASED VIDEO Public/Granted day: 2024-04-18