Jointly modeling embedding and translation to bridge video and language

Invention Grant

US09807473B2 Jointly modeling embedding and translation to bridge video and language (In force)

Patent Title: Jointly modeling embedding and translation to bridge video and language
Application No.: US14946988

Application Date: 2015-11-20
Publication No.: US09807473B2

Publication Date: 2017-10-31
Inventor: Tao Mei, Ting Yao, Yong Rui
Applicant: Microsoft Technology Licensing, LLC
Applicant Address: US WA Redmond
Assignee: Microsoft Technology Licensing, LLC
Current Assignee: Microsoft Technology Licensing, LLC
Current Assignee Address: US WA Redmond
Main IPC: H04N5/445
IPC: H04N5/445; H04N21/8405; G06F17/27; G06K9/00; G06N3/08

Abstract:

Video description generation using neural network training based on relevance and coherence is described. In some examples, long short-term memory with visual-semantic embedding (LSTM-E) can maximize the probability of generating the next word given the previous words and the visual content, and can create a visual-semantic embedding space that enforces the relationship between the semantics of an entire sentence and the visual content. LSTM-E can include 2-D and/or 3-D deep convolutional neural networks for learning powerful video representations, a deep recurrent neural network for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics.
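
The two training signals named in the abstract, relevance and coherence, can be illustrated with a small sketch. This is not the patented implementation: the squared Euclidean distance, the weighted-sum combination, the `tradeoff` parameter, and all function names are assumptions made for illustration. The abstract only states that both a visual-semantic embedding and next-word probability are used.

```python
import math


def relevance_loss(video_emb, sentence_emb):
    """Squared Euclidean distance between a video vector and a sentence
    vector in a shared embedding space (the distance choice is an assumption)."""
    return sum((v - s) ** 2 for v, s in zip(video_emb, sentence_emb))


def coherence_loss(next_word_probs):
    """Negative log-likelihood of the reference sentence, given the model's
    probability for each correct next word (one value per time step)."""
    return -sum(math.log(p) for p in next_word_probs)


def joint_loss(video_emb, sentence_emb, next_word_probs, tradeoff=0.5):
    """Weighted sum of the two terms; `tradeoff` is a hypothetical
    hyperparameter in [0, 1] that balances relevance against coherence."""
    relevance = relevance_loss(video_emb, sentence_emb)
    coherence = coherence_loss(next_word_probs)
    return (1.0 - tradeoff) * relevance + tradeoff * coherence


if __name__ == "__main__":
    # Toy inputs: 2-D embeddings and a two-word sentence.
    print(joint_loss([1.0, 2.0], [1.0, 0.0], [0.5, 0.25], tradeoff=0.5))
```

Under these assumptions, minimizing the first term pulls the video and sentence embeddings together, and minimizing the second raises the probability of each correct next word, so a single objective covers both goals.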

Published/Granted Documents

US20170150235A1 Jointly Modeling Embedding and Translation to Bridge Video and Language, publication date: 2017-05-25

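IPC Classification:

H	Electricity
H04	Electric communication technique
H04N	Pictorial communication, e.g. television
H04N5/00	Details of television systems (scanning details, or their combination with generation of supply voltages, are classified in H04N3/00)
H04N5/44	. Receiver circuitry (H04N5/14 takes precedence)
H04N5/445	.. for displaying additional information (H04N5/50 takes precedence)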