Graph convolutional networks for video grounding
Abstract:
A method and apparatus that include receiving a query describing an aspect of a video, the video including a plurality of frames; identifying multiple proposals that potentially correspond to the query, where each of the proposals includes a subset of the plurality of frames; ranking the proposals using a graph convolutional network that identifies relationships between the proposals; and selecting, based on the ranking, one of the proposals as a video segment that correlates to the query.
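The ranking step can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or how scores are computed, so everything below is an assumption made for illustration: proposals are modeled as inclusive frame intervals, graph edges are weighted by temporal overlap (IoU), a single graph convolution layer computes ReLU(A·H·W) over a row-normalized adjacency matrix A, and each proposal is scored by a dot product with a query embedding. All function names, feature vectors, and weights are hypothetical.

```python
from typing import List, Tuple

Proposal = Tuple[int, int]  # (start_frame, end_frame), inclusive; hypothetical format
Matrix = List[List[float]]


def temporal_iou(a: Proposal, b: Proposal) -> float:
    """Overlap between two frame intervals, used here as an assumed edge weight."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union


def build_adjacency(proposals: List[Proposal]) -> Matrix:
    """Row-normalized IoU adjacency; the diagonal (IoU = 1) acts as a self-loop."""
    adj = [[temporal_iou(p, q) for q in proposals] for p in proposals]
    return [[v / sum(row) for v in row] for row in adj]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    hidden = matmul(matmul(adj, feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def rank_proposals(proposals, feats, weight, query_vec):
    """Return proposal indices sorted by descending score, plus the raw scores."""
    hidden = gcn_layer(build_adjacency(proposals), feats, weight)
    scores = [sum(h * q for h, q in zip(row, query_vec)) for row in hidden]
    order = sorted(range(len(proposals)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy inputs (all values are made up for illustration).
proposals = [(0, 9), (5, 14), (30, 39)]
feats = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]   # per-proposal features
weight = [[1.0, 0.0], [0.0, 1.0]]              # identity stands in for learned weights
query_vec = [1.0, 0.0]                         # stand-in for a query embedding

order, scores = rank_proposals(proposals, feats, weight, query_vec)
best_segment = proposals[order[0]]             # the selected video segment
```

In this sketch the two overlapping proposals exchange information through the graph, while the non-overlapping third proposal keeps its own features. A trained system would learn the weights and the query embedding rather than use fixed values.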