Publication No.: US20240169732A1
Publication Date: 2024-05-23
Application No.: US18227560
Filing Date: 2023-07-28
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Mikita DVORNIK, Isma Hadji, Ran Zhang, Konstantinos Derpanis, Richard Wildes, Animesh Garg, Allan Douglas Jepson
IPC: G06V20/40, G06F16/735, G06F16/783, G06V10/77, G06V10/86, G09B5/06
CPC classification number: G06V20/46, G06F16/735, G06F16/7844, G06V10/7715, G06V10/86, G06V20/48, G09B5/065
Abstract: The present disclosure provides methods, apparatuses, and computer-readable media for step discovery and localization in an instructional video. In some embodiments, the method includes: extracting, from the instructional video using a transformer model, a plurality of step slots corresponding to a plurality of procedure steps depicted in the instructional video; matching, using an order-aware sequence-to-sequence alignment model, a plurality of video segments of the instructional video to the plurality of step slots; generating a temporally-ordered plurality of video segments from the plurality of video segments; receiving a user query requesting a procedure step; selecting, from the plurality of video segments of the instructional video, a video segment corresponding to the requested procedure step; and providing, in response to the user query, the selected video segment and its matching textual step description.
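
The matching and selection steps in the abstract can be illustrated with a small sketch. It is not the method claimed in the application: the abstract does not specify the alignment model, so the sketch assumes a simple dynamic program over a precomputed similarity matrix, in which step indices may only stay the same or increase over time and a segment may be left unmatched as background. The names `align_segments_to_steps`, `segment_span`, `sim`, and `drop_score` are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Tuple


def align_segments_to_steps(sim: List[List[float]], drop_score: float = 0.0) -> List[int]:
    """Order-preserving assignment of video segments to step slots.

    sim[k][n] is an assumed similarity between step slot k and video segment n.
    Returns one label per segment: a step index, or -1 if the segment is
    treated as background (its best similarity does not beat drop_score).
    """
    num_steps = len(sim)
    num_segments = len(sim[0]) if num_steps else 0
    if num_steps == 0 or num_segments == 0:
        return [-1] * num_segments

    neg_inf = float("-inf")
    # score[n][k]: best total score for segments 0..n with the step pointer at k.
    score = [[neg_inf] * num_steps for _ in range(num_segments)]
    back = [[0] * num_steps for _ in range(num_segments)]

    for n in range(num_segments):
        # Running maximum over previous pointers j <= k keeps the order constraint.
        best_prev, best_prev_k = (0.0, 0) if n == 0 else (neg_inf, 0)
        for k in range(num_steps):
            if n > 0 and score[n - 1][k] > best_prev:
                best_prev, best_prev_k = score[n - 1][k], k
            # A segment either matches step k or is dropped at a fixed score.
            score[n][k] = best_prev + max(sim[k][n], drop_score)
            back[n][k] = best_prev_k

    # Backtrack from the best final pointer to recover per-segment labels.
    k = max(range(num_steps), key=lambda j: score[-1][j])
    labels = [-1] * num_segments
    for n in range(num_segments - 1, -1, -1):
        if sim[k][n] > drop_score:
            labels[n] = k
        k = back[n][k]
    return labels


def segment_span(labels: List[int], step: int) -> Optional[Tuple[int, int]]:
    """Return (first, last) segment index assigned to the queried step, if any."""
    hits = [n for n, label in enumerate(labels) if label == step]
    return (hits[0], hits[-1]) if hits else None


# Toy similarities: 2 step slots (rows) by 5 video segments (columns).
example_sim = [
    [0.9, 0.8, 0.1, 0.2, 0.1],
    [0.1, 0.2, 0.1, 0.9, 0.7],
]
example_labels = align_segments_to_steps(example_sim, drop_score=0.3)
```

With these toy values, the third segment matches neither step well and is left as background, while a query for the second step (index 1) would be answered via `segment_span(example_labels, 1)`. In a real system the similarity matrix would come from learned step-slot and video-segment embeddings, which this sketch does not model.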