Invention Application
WO2017052791A1 SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT (Under examination, published)

SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT
Abstract:
A method of embedding video for text search includes extracting visual features from a video. The visual features may include, for example, appearance information, motion, audio, and/or similar features. Term vectors are determined from textual descriptions associated with the video; the text may come, for example, from the video's title or from within the video itself (e.g., subtitles). A feature projection is computed from the extracted visual features, and a textual projection is computed from the term vectors. A semantic embedding is then computed from the feature projection and the textual projection by jointly optimizing semantic predictability and semantic descriptiveness.
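The joint optimization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the objective, so everything below is an assumption made for illustration: "predictability" is read as how well the embedding S can be predicted from the video features X through a feature projection W, "descriptiveness" as how well the term vectors Y can be reconstructed from S through a textual projection A, and both are measured with squared error and minimized by alternating least squares. The names fit_embedding and rank_videos, the parameters dim and lam, and the query scoring rule are hypothetical and are not taken from the application.

```python
import numpy as np


def fit_embedding(X, Y, dim=3, lam=1e-2, iters=50, seed=0):
    """Alternating least squares for an assumed joint objective:

        J = ||Y - S A||^2 + ||S - X W||^2 + lam * (||W||^2 + ||A||^2)

    X: (n_videos, d_feat) video features.
    Y: (n_videos, n_terms) term vectors.
    Returns W (d_feat, dim), A (dim, n_terms), S (n_videos, dim), losses.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    S = rng.standard_normal((n, dim))  # random initial embedding
    I_d, I_k = np.eye(d), np.eye(dim)
    losses = []
    for _ in range(iters):
        # Feature projection (assumed "predictability" term): ridge fit X W ~ S.
        W = np.linalg.solve(X.T @ X + lam * I_d, X.T @ S)
        # Textual projection (assumed "descriptiveness" term): ridge fit S A ~ Y.
        A = np.linalg.solve(S.T @ S + lam * I_k, S.T @ Y)
        # Embedding update: closed-form minimizer of J over S with W, A fixed,
        # S = (Y A^T + X W)(A A^T + I)^{-1}; the matrix A A^T + I is symmetric.
        S = np.linalg.solve(A @ A.T + I_k, (Y @ A.T + X @ W).T).T
        loss = (np.sum((Y - S @ A) ** 2) + np.sum((S - X @ W) ** 2)
                + lam * (np.sum(W ** 2) + np.sum(A ** 2)))
        losses.append(float(loss))
    return W, A, S, losses


def rank_videos(X, W, A, query_terms):
    """Rank videos for a text query (hypothetical scoring rule).

    Each video's features are projected into the embedding (X W), mapped to
    predicted term scores (X W A), and scored by a dot product with the
    query's term vector. Returns video indices, best match first.
    """
    scores = X @ W @ A @ query_terms
    return np.argsort(-scores)
```

In this sketch, a query would be encoded as a term vector over the same vocabulary as Y (for example, a 1 at each query term's index), and rank_videos would be applied to the features of the videos being searched. Because each update is an exact minimizer of the assumed objective over one block of variables, the recorded loss should not increase from one iteration to the next.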