Invention Application
- Patent Title: SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT
- Patent Title (Chinese): 用于视频搜索的语义多用途嵌入式文本
- Application No.: PCT/US2016/045353
- Application Date: 2016-08-03
- Publication No.: WO2017052791A1
- Publication Date: 2017-03-30
- Inventor: HABIBIAN, Amirhossein; MENSINK, Thomas, Edgar, Josef; SNOEK, Cornelis, Gerardus, Maria
- Applicant: QUALCOMM INCORPORATED
- Applicant Address: ATTN: International IP Administration 5775 Morehouse Drive San Diego, CA 92121-1714 US
- Assignee: QUALCOMM INCORPORATED
- Current Assignee: QUALCOMM INCORPORATED
- Current Assignee Address: ATTN: International IP Administration 5775 Morehouse Drive San Diego, CA 92121-1714 US
- Agency: LENKIN, Alan, M. et al.
- Priority: US62/221,569 20150921; US15/080,501 20160324
- Main IPC: G06F17/30
- IPC: G06F17/30; G06N5/00
Abstract:
A method of embedding video for text search includes extracting visual features from a video. The visual features may, for example, include appearance information, motion, audio, and/or like features. Term vectors are determined from textual descriptions associated with the video. The text may be included in a title for the video or included within the video (e.g., subtitles), for example. A feature projection is computed based on the extracted video features and a textual projection is computed based on the term vectors. A semantic embedding is computed based on the feature projection and the textual projection by jointly optimizing semantic predictability and semantic descriptiveness.
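The joint optimization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the objective, so everything below is an assumption made for illustration: the squared-error form of the two terms, the plain gradient-descent solver, and the names `X` (extracted features, one column per video), `Y` (term vectors), `W` (feature projection), `A` (textual projection), and `S` (semantic embedding). Here "descriptiveness" is taken to mean how well the embedding reconstructs the term vectors, and "predictability" how well the embedding can be predicted from the video features.

```python
import random

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def step(param, grad, lr):
    """One gradient-descent update: param - lr * grad."""
    return [[p - lr * g for p, g in zip(rp, rg)] for rp, rg in zip(param, grad)]

def sq_norm(a):
    return sum(x * x for row in a for x in row)

def loss(X, Y, W, A, S):
    # Assumed objective: ||Y - A S||^2 + ||S - W X||^2
    descriptiveness = sq_norm(sub(Y, matmul(A, S)))  # S should reconstruct the term vectors
    predictability = sq_norm(sub(S, matmul(W, X)))   # S should be predictable from features
    return descriptiveness + predictability

def fit(X, Y, k, steps=500, lr=0.01, seed=0):
    """Jointly fit W (k x d), A (t x k), and S (k x n) by gradient descent."""
    rng = random.Random(seed)
    d, t, n = len(X), len(Y), len(X[0])

    def init(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    W, A, S = init(k, d), init(t, k), init(k, n)
    history = [loss(X, Y, W, A, S)]
    for _ in range(steps):
        r_d = sub(matmul(A, S), Y)  # descriptiveness residual, t x n
        r_p = sub(matmul(W, X), S)  # predictability residual, k x n
        # Gradients of the assumed objective, up to a factor of 2
        # that is applied through the step size below.
        grad_A = matmul(r_d, transpose(S))
        grad_W = matmul(r_p, transpose(X))
        grad_S = sub(matmul(transpose(A), r_d), r_p)
        A = step(A, grad_A, 2 * lr)
        W = step(W, grad_W, 2 * lr)
        S = step(S, grad_S, 2 * lr)
        history.append(loss(X, Y, W, A, S))
    return W, A, S, history

# Toy data (hypothetical): 3 features, 4 vocabulary terms, 5 videos.
X = [[1.0, 0.9, 0.1, 0.0, 0.5],
     [0.0, 0.1, 0.9, 1.0, 0.5],
     [0.2, 0.1, 0.3, 0.2, 0.9]]
Y = [[1, 1, 0, 0, 1],
     [0, 0, 1, 1, 0],
     [1, 0, 0, 0, 1],
     [0, 0, 1, 0, 1]]

W, A, S, history = fit(X, Y, k=2)
print("initial loss:", history[0], "final loss:", history[-1])
```

Under these assumptions, a new video with feature vector x would be placed in the embedding space as W x without needing any text, which is what the predictability term is meant to make possible.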