Invention Grant
- Patent Title: Video retrieval system using adaptive spatiotemporal convolution feature representation with dynamic abstraction for video to language translation
- Application No.: US15794802
- Application Date: 2017-10-26
- Publication No.: US10402658B2
- Publication Date: 2019-09-03
- Inventor: Renqiang Min, Yunchen Pu
- Applicant: NEC Laboratories America, Inc.
- Applicant Address: JP
- Assignee: NEC Corporation
- Current Assignee: NEC Corporation
- Current Assignee Address: JP
- Agent: Joseph Kolodka
- Main IPC: G06K9/00
- IPC: G06K9/00; G06K9/46; G06N3/04; G06K9/66; H04N5/278; G06K9/62; H04N21/218; H04N21/234; H04N21/488; G06K9/72; H04N7/18

Abstract:
A video retrieval system is provided that includes a set of servers configured to retrieve a video sequence from a database and forward it to a requesting device responsive to a match between an input text and a caption for the video sequence. The servers are further configured to translate the video sequence into the caption by (A) applying a C3D to image frames of the video sequence to obtain therefor (i) intermediate feature representations across L convolutional layers and (ii) top-layer features, (B) producing a first word of the caption for the video sequence by applying the top-layer features to an LSTM, and (C) producing subsequent words of the caption by (i) dynamically performing spatiotemporal attention and layer attention using the representations to form a context vector, and (ii) applying the LSTM to the context vector, a previous word of the caption, and a hidden state of the LSTM.
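
Step (C)(i) of the abstract can be illustrated with a minimal sketch of how a context vector might be formed from the per-layer features. This is not the patented implementation. The dot-product scoring, the function names, and the assumption that every layer's features have already been projected to the same dimension as the LSTM hidden state are all hypothetical choices made for illustration. The LSTM update of step (C)(ii) is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def context_vector(layer_features, hidden):
    """Form a context vector from L layers of features.

    layer_features: list of L layers; each layer is a list of feature
        vectors, one per spatiotemporal location (assumed here to share
        the dimension of `hidden`).
    hidden: current LSTM hidden state as a list of floats.
    Returns (context, layer_weights).
    """
    summaries = []
    for locations in layer_features:
        # Spatiotemporal attention: weight each location within a layer.
        # Dot-product scoring is an assumption, not taken from the patent.
        alpha = softmax([dot(f, hidden) for f in locations])
        summaries.append(weighted_sum(alpha, locations))

    # Layer attention: weight the per-layer summaries against each other.
    beta = softmax([dot(s, hidden) for s in summaries])
    return weighted_sum(beta, summaries), beta


# Toy usage: two layers, two-dimensional features.
features = [
    [[1.0, 0.0], [0.0, 1.0]],  # layer 1: two locations
    [[1.0, 1.0]],              # layer 2: one location
]
context, layer_weights = context_vector(features, hidden=[0.5, -0.5])
print(context, layer_weights)
```

Because both sets of weights are recomputed from the hidden state at every decoding step, the locations and layers that are emphasized can change from one word to the next, which is the sense in which the attention is dynamic.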