SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT

Invention Application

WO2017052791A1 SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT (Status: under examination, published)

Patent Title: SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT
Application No.: PCT/US2016/045353

Application Date: 2016-08-03
Publication No.: WO2017052791A1

Publication Date: 2017-03-30
Inventor: HABIBIAN, Amirhossein; MENSINK, Thomas, Edgar, Josef; SNOEK, Cornelis, Gerardus, Maria
Applicant: QUALCOMM INCORPORATED
Applicant Address: ATTN: International IP Administration 5775 Morehouse Drive San Diego, CA 92121-1714 US
Assignee: QUALCOMM INCORPORATED
Current Assignee: QUALCOMM INCORPORATED
Current Assignee Address: ATTN: International IP Administration 5775 Morehouse Drive San Diego, CA 92121-1714 US
Agency: LENKIN, Alan, M. et al.
Priority: US62/221,569 20150921; US15/080,501 20160324
Main IPC: G06F17/30
IPC: G06F17/30; G06N5/00

Abstract:

A method of embedding video for text search includes extracting visual features from a video. The visual features may, for example, include appearance information, motion, audio, and/or like features. Term vectors are determined from textual descriptions associated with the video. The text may be included in a title for the video or included within the video (e.g., subtitles), for example. A feature projection is computed based on the extracted video features and a textual projection is computed based on the term vectors. A semantic embedding is computed based on the feature projection and the textual projection by jointly optimizing semantic predictability and semantic descriptiveness.
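
The joint optimization described in the abstract can be illustrated with a small sketch. The abstract does not give the objective, the solver, or any parameter names, so everything below is an assumption made for illustration: a shared embedding S is fit by alternating ridge-regularized least squares so that it both reconstructs the term vectors Y through a textual projection A (descriptiveness) and is predictable from the video features X through a feature projection W (predictability). The names fit_joint_embedding, rank_videos, k, lam, and reg are hypothetical and do not come from the application.

```python
import numpy as np

def fit_joint_embedding(X, Y, k=8, lam=1.0, reg=1e-3, iters=20, seed=0):
    """Illustrative joint fit (an assumption, not the claimed method).

    X: (n, d) video features, Y: (n, m) term vectors.
    Minimizes ||Y - S A||^2 + lam * ||S - X W||^2 + reg * (||A||^2 + ||W||^2)
    by updating A, W, and S in turn with closed-form least-squares steps.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    S = rng.standard_normal((n, k))          # initial embedding of each video
    Ik, Id = np.eye(k), np.eye(d)
    losses = []
    for _ in range(iters):
        # Textual projection: best ridge decoder from embedding to terms.
        A = np.linalg.solve(S.T @ S + reg * Ik, S.T @ Y)
        # Feature projection: best ridge predictor of embedding from features.
        W = np.linalg.solve(lam * X.T @ X + reg * Id, lam * X.T @ S)
        # Embedding: balances reconstructing Y and staying close to X W.
        S = np.linalg.solve(A @ A.T + lam * Ik, (Y @ A.T + lam * X @ W).T).T
        loss = (np.sum((Y - S @ A) ** 2)
                + lam * np.sum((S - X @ W) ** 2)
                + reg * (np.sum(A ** 2) + np.sum(W ** 2)))
        losses.append(float(loss))
    return W, A, losses

def rank_videos(query_terms, X, W, A):
    """Rank videos for a text query by cosine similarity in the embedding."""
    q = query_terms @ np.linalg.pinv(A)      # least-squares embedding of the query
    V = X @ W                                # embeddings predicted from video features
    sims = (V @ q) / (np.linalg.norm(V, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)                 # video indices, best match first
```

With this formulation, lam trades off the two terms: a larger value favors embeddings that are easy to predict from the video, while a smaller value favors embeddings that describe the text more faithfully. At search time only W and A are needed, since a query is mapped through A and each video through W.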
