Spatial attention model for image captioning

Invention Grant

US10558750B2 Spatial attention model for image captioning (Active)

Patent Title: Spatial attention model for image captioning
Application No.: US15817153

Application Date: 2017-11-17
Publication No.: US10558750B2

Publication Date: 2020-02-11
Inventor: Jiasen Lu, Caiming Xiong, Richard Socher
Applicant: salesforce.com, inc.
Applicant Address: US CA San Francisco
Assignee: salesforce.com, inc.
Current Assignee: salesforce.com, inc.
Current Assignee Address: US CA San Francisco
Agency: Haynes and Boone, LLP.
Main IPC: G06F17/27
IPC: G06F17/27; G06K9/00; G06K9/62; G06K9/46; G06F17/24; G06K9/48; G06K9/66; G06N3/08

Abstract:

The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.
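The mechanism summarized in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no equations, so everything specific here is an assumption: the gate formula for the visual sentinel, the additive scoring function, and all names (`visual_sentinel`, `adaptive_attention`, `W_x`, `W_h`, `W_v`, `W_g`, `W_s`, `w_h`) are hypothetical and chosen only for illustration. The sketch shows one decoder timestep: a sentinel is derived from the LSTM memory, scored alongside the spatial image features, and the softmax weight it receives (`beta`) indicates how much the model leans on linguistic rather than visual information.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def visual_sentinel(x_t, h_prev, m_t, W_x, W_h):
    """Assumed sentinel gate: gate the LSTM memory cell m_t (shape (d,))."""
    gate = sigmoid(x_t @ W_x + h_prev @ W_h)  # auxiliary sentinel gate, (d,)
    return gate * np.tanh(m_t)                # visual sentinel s_t, (d,)

def adaptive_attention(V, h_t, s_t, W_v, W_g, W_s, w_h):
    """One step of (assumed) adaptive attention.

    V:   (k, d) spatial image features from a CNN
    h_t: (d,)   current decoder hidden state, used to guide attention
    s_t: (d,)   visual sentinel
    """
    # Score each of the k image regions against the current hidden state.
    z = np.tanh(V @ W_v + h_t @ W_g) @ w_h        # (k,)
    # Score the sentinel with the same scoring vector.
    z_s = np.tanh(s_t @ W_s + h_t @ W_g) @ w_h    # scalar
    # Joint softmax over the k regions plus the sentinel.
    alpha_hat = softmax(np.append(z, z_s))        # (k + 1,)
    alpha, beta = alpha_hat[:-1], alpha_hat[-1]
    # Mixed context: attended image features plus the sentinel share.
    context = alpha @ V + beta * s_t              # (d,)
    return context, alpha, beta

# Usage with random placeholder values (no trained weights are implied).
rng = np.random.default_rng(0)
k, d, a = 49, 8, 6  # regions, feature size, attention size (all arbitrary)
V = rng.standard_normal((k, d))
x_t, h_prev, m_t, h_t = (rng.standard_normal(d) for _ in range(4))
W_x, W_h = rng.standard_normal((d, d)), rng.standard_normal((d, d))
W_v, W_g, W_s = (rng.standard_normal((d, a)) for _ in range(3))
w_h = rng.standard_normal(a)

s_t = visual_sentinel(x_t, h_prev, m_t, W_x, W_h)
context, alpha, beta = adaptive_attention(V, h_t, s_t, W_v, W_g, W_s, w_h)
```

In this sketch a `beta` near 1 means the next word is driven mostly by the sentinel (linguistic context), while a `beta` near 0 means it is driven mostly by the attended image regions.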

Public/Granted literature

US20180143966A1 Spatial Attention Model for Image Captioning (Public/Granted day: 2018-05-24)
