Patent search ap:("Adobe Inc.") AND inv:"Didac SURIS COLL-VINENT"

1.

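Published invention application
SELF-SUPERVISED AUDIO-VISUAL LEARNING FOR CORRELATING MUSIC AND VIDEO (Status: under examination, published)

Publication (announcement) number: US20230368503A1

Publication (announcement) date: 2023-11-16

Application number: US17742322

Application date: 2022-05-11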

Applicant: Adobe Inc.

Inventor: Justin SALAMON, Bryan RUSSELL, Didac SURIS COLL-VINENT

IPC: G06V10/774, G06V20/40, G06V10/74, G10L25/57, G10L25/03

CPC classification number: G06V10/774, G06V20/49, G06V20/46, G06V10/761, G10L25/57, G10L25/03

Abstract: Embodiments are disclosed for correlating video sequences and audio sequences by a media recommendation system using a trained encoder network. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a training input including a media sequence, including a video sequence paired with an audio sequence, segmenting the media sequence into a set of video sequence segments and a set of audio sequence segments, extracting visual features for each video sequence segment and audio features for each audio sequence segment, generating, by transformer networks, contextualized visual features from the extracted visual features and contextualized audio features from the extracted audio features, the transformer networks including a visual transformer and an audio transformer, generating predicted video and audio sequence segment pairings based on the contextualized visual and audio features, and training the visual transformer and the audio transformer to generate the contextualized visual and audio features.
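
The pipeline in the abstract (segment, extract features, predict segment pairings, train) can be illustrated with a minimal sketch. This is not the patented implementation. The abstract does not name a training objective, a similarity measure, or a segment length, so the contrastive (InfoNCE-style) loss, cosine similarity, mean pooling, and every function name below are assumptions made for illustration. The visual and audio transformers are omitted entirely; pooled segment features stand in for the contextualized features they would produce.

```python
import math

def segment(features, seg_len):
    """Split a sequence of feature vectors into consecutive fixed-length
    segments, dropping any trailing partial segment."""
    return [features[i:i + seg_len]
            for i in range(0, len(features) - seg_len + 1, seg_len)]

def mean_pool(seg):
    """Average the vectors in one segment into a single feature vector."""
    dim = len(seg[0])
    return [sum(vec[d] for vec in seg) / len(seg) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def predict_pairings(video_feats, audio_feats):
    """For each video segment, return the index of the most similar audio segment."""
    return [max(range(len(audio_feats)), key=lambda j: cosine(v, audio_feats[j]))
            for v in video_feats]

def contrastive_loss(video_feats, audio_feats, temperature=0.1):
    """Assumed InfoNCE-style objective: the i-th video segment should score
    highest against the i-th audio segment (its true pairing)."""
    total = 0.0
    for i, v in enumerate(video_feats):
        logits = [cosine(v, a) / temperature for a in audio_feats]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(video_feats)

# Toy paired media sequence: 4 time steps of 2-D "visual" and "audio" features.
video_frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
audio_frames = [[2.0, 0.1], [2.0, 0.1], [0.1, 3.0], [0.1, 3.0]]

video_feats = [mean_pool(s) for s in segment(video_frames, 2)]
audio_feats = [mean_pool(s) for s in segment(audio_frames, 2)]

pairings = predict_pairings(video_feats, audio_feats)
loss = contrastive_loss(video_feats, audio_feats)
```

In a trained system of the kind the abstract describes, the loss would be backpropagated into the two transformers. Here it is only evaluated on fixed toy features to show how predicted pairings and a training signal could be derived from segment-level features.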
