Invention Application
- Patent Title: Jointly Modeling Embedding and Translation to Bridge Video and Language
- Application No.: US14946988
- Application Date: 2015-11-20
- Publication No.: US20170150235A1
- Publication Date: 2017-05-25
- Inventor: Tao Mei, Ting Yao, Yong Rui
- Applicant: Microsoft Technology Licensing, LLC
- Main IPC: H04N21/8405
- IPC: H04N21/8405 ; G06K9/00 ; G06N3/08 ; G06F17/27

Abstract:
Video description generation using neural network training based on relevance and coherence is described. In some examples, long short-term memory with visual-semantic embedding (LSTM-E) can maximize the probability of generating the next word given the previous words and the visual content, and can create a visual-semantic embedding space for enforcing the relationship between the semantics of an entire sentence and the visual content. LSTM-E can include 2-D and/or 3-D deep convolutional neural networks for learning a powerful video representation, a deep recurrent neural network for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics.
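The abstract's joint objective (word-by-word generation plus a visual-semantic embedding constraint) can be illustrated with a minimal, framework-free sketch. The function names, the squared-distance relevance term, the negative-log-likelihood coherence term, and the mixing weight `lam` are illustrative assumptions, not the claimed implementation:

```python
import math

def relevance_loss(video_emb, sent_emb):
    # Relevance term (assumed): squared Euclidean distance between the video
    # and sentence vectors in the shared visual-semantic embedding space.
    return sum((v - s) ** 2 for v, s in zip(video_emb, sent_emb))

def coherence_loss(word_probs):
    # Coherence term (assumed): negative log-likelihood of each ground-truth
    # word given its predecessors and the visual content, as produced by the
    # recurrent decoder.
    return -sum(math.log(p) for p in word_probs)

def lstm_e_loss(video_emb, sent_emb, word_probs, lam=0.7):
    # Hypothetical joint objective: a weighted sum of the embedding
    # (relevance) term and the sentence-generation (coherence) term.
    return (1 - lam) * relevance_loss(video_emb, sent_emb) \
        + lam * coherence_loss(word_probs)
```

A perfectly aligned embedding and a decoder that assigns probability 1 to every ground-truth word would drive both terms, and hence the joint loss, to zero.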
Public/Granted literature
- US09807473B2 Jointly modeling embedding and translation to bridge video and language Public/Granted day: 2017-10-31