MULTITRACK EFFECT VISUALIZATION AND INTERACTION FOR TEXT-BASED VIDEO EDITING

    Publication number: US20240233769A1

    Publication date: 2024-07-11

    Application number: US18152328

    Filing date: 2023-01-10

    Applicant: Adobe Inc.

    CPC classification number: G11B27/031 G06F3/04842 G11B27/34

    Abstract: Embodiments of the present disclosure provide systems, methods, and computer storage media providing visualizations and mechanisms utilized when performing video edits using wrapped timelines (e.g., effect bars/effect tracks) interspersed between text lines representing video effects being applied to text segments in a transcript. An example embodiment provides a transcript using an audio track from a transcribed video. A transcript interface presents the transcript and accepts an input selecting sentences or words from the transcript. The identified boundaries corresponding to the selected text segment are used as boundaries for a selected video segment. Using the selected text segment, a user selects a video effect in which to apply to the corresponding video segment and within the transcript interface, a wrapped timeline is placed in the transcript along the selected text segment to indicate that the video effect is applied to the corresponding video segment.
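
The boundary mapping described in this abstract can be illustrated with a minimal sketch. It assumes the transcript carries per-word timestamps; the `Word` type and the `segment_bounds` helper are hypothetical names introduced here, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its (assumed) audio timestamps in seconds."""
    text: str
    start: float
    end: float


def segment_bounds(words, first, last):
    """Return (start, end) times of the video segment that corresponds to
    the selected words[first..last], inclusive on both ends."""
    if not (0 <= first <= last < len(words)):
        raise ValueError("selection is outside the transcript")
    # The selected text's outer word boundaries become the video boundaries.
    return words[first].start, words[last].end
```

For example, selecting the second and third words of a three-word transcript yields a segment that starts where the second word starts and ends where the third word ends. An effect applied to that selection would then cover exactly that time range.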

    Face-aware speaker diarization for transcripts and text-based video editing

    Publication number: US12125501B2

    Publication date: 2024-10-22

    Application number: US17967399

    Filing date: 2022-10-17

    Applicant: Adobe Inc.

    CPC classification number: G11B27/031 G06V20/41

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for face-aware speaker diarization. In an example embodiment, an audio-only speaker diarization technique is applied to generate an audio-only speaker diarization of a video, an audio-visual speaker diarization technique is applied to generate a face-aware speaker diarization of the video, and the audio-only speaker diarization is refined using the face-aware speaker diarization to generate a hybrid speaker diarization that links detected faces to detected voices. In some embodiments, to accommodate videos with small faces that appear pixelated, a cropped image of any given face is extracted from each frame of the video, and the size of the cropped image is used to select a corresponding active speaker detection model to predict an active speaker score for the face in the cropped image.
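
The size-based model selection mentioned in this abstract might look roughly like the sketch below. The 64-pixel threshold, the model names, and the `select_model_key` function are all assumptions made for illustration; the abstract does not state concrete values or how many models are used.

```python
def select_model_key(crop_width, crop_height, small_threshold=64):
    """Pick which active speaker detection model to run on a cropped face.

    small_threshold is a hypothetical cutoff in pixels: crops whose shorter
    side falls below it are routed to a model assumed to handle small,
    pixelated faces.
    """
    if crop_width <= 0 or crop_height <= 0:
        raise ValueError("crop dimensions must be positive")
    if min(crop_width, crop_height) < small_threshold:
        return "small_face_model"  # hypothetical model name
    return "standard_model"        # hypothetical model name
```

The returned key would then be used to look up the model that predicts the active speaker score for that crop.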

    Speaker thumbnail selection and speaker visualization in diarized transcripts for text-based video

    Publication number: US12300272B2

    Publication date: 2025-05-13

    Application number: US17967697

    Filing date: 2022-10-17

    Applicant: Adobe Inc.

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for selection of the best image of a particular speaker's face in a video, and visualization in a diarized transcript. In an example embodiment, candidate images of a face of a detected speaker are extracted from frames of a video identified by a detected face track for the face, and a representative image of the detected speaker's face is selected from the candidate images based on image quality, facial emotion (e.g., using an emotion classifier that generates a happiness score), a size factor (e.g., favoring larger images), and/or penalizing images that appear towards the beginning or end of a face track. As such, each segment of the transcript is presented with the representative image of the speaker who spoke that segment and/or input is accepted changing the representative image associated with each speaker.
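
One way to combine the factors listed in this abstract is a simple additive score, sketched below. The equal weighting, the 10% edge margin, the penalty value, and the names `Candidate` and `pick_representative` are assumptions for illustration only; the abstract names the factors but not how they are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate face image taken from one frame of a face track."""
    frame_index: int
    quality: float    # assumed image-quality score in [0, 1]
    happiness: float  # assumed emotion-classifier score in [0, 1]
    area: float       # crop area in pixels


def pick_representative(cands, track_start, track_end,
                        edge_margin=0.1, edge_penalty=0.5):
    """Return the highest-scoring candidate for a face track.

    Raises ValueError if cands is empty.
    """
    max_area = max(c.area for c in cands)
    span = max(track_end - track_start, 1)

    def score(c):
        # Favor quality, happiness, and larger crops (area normalized to 1).
        s = c.quality + c.happiness + c.area / max_area
        # Penalize frames near the beginning or end of the track.
        pos = (c.frame_index - track_start) / span
        if pos < edge_margin or pos > 1 - edge_margin:
            s -= edge_penalty
        return s

    return max(cands, key=score)
```

With these assumed weights, a slightly lower-quality image from the middle of a track can win over a sharper image taken from its first or last frames.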

    User interface creation from screenshots

    Publication number: US10360473B2

    Publication date: 2019-07-23

    Application number: US15608641

    Filing date: 2017-05-30

    Applicant: Adobe Inc.

    Abstract: User interface creation from screenshots is described. Initially, a user captures a screenshot of an existing graphical user interface (GUI). In one or more implementations, the screenshot is processed to generate different types of templates that are modifiable by users to create new GUIs. These different types of templates can include a snapping template, a wireframe template, and a stylized template. The described templates may aid GUI development in different ways depending on the type selected. To generate a template, the screenshot serving as the basis for the template is segmented into groups of pixels corresponding to components of the existing GUI. A type of component is identified for each group of pixels and locations in the screenshot are determined. Based on the identified types of GUI components and determined locations, the user-modifiable template for creating a new GUI is generated.
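
The final step in this abstract, building a template from identified component types and locations, can be sketched as follows. The segmentation and classification stages are assumed to have already run; the `Component` type, the `wireframe_template` function, and the output layout are hypothetical and not drawn from the patent.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """A GUI component detected in a screenshot (assumed detector output)."""
    kind: str    # identified component type, e.g. "button"
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int
    height: int


def wireframe_template(components):
    """Build a user-modifiable wireframe: one editable placeholder per
    detected component, ordered top-to-bottom and then left-to-right."""
    ordered = sorted(components, key=lambda c: (c.y, c.x))
    return [
        {"type": c.kind, "box": (c.x, c.y, c.width, c.height), "editable": True}
        for c in ordered
    ]
```

A snapping or stylized template would presumably carry additional information, such as alignment guides or extracted colors, which this sketch leaves out.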
