-
11.
Publication No.: US20170262996A1
Publication date: 2017-09-14
Application No.: US15250755
Filing date: 2016-08-29
Applicant: QUALCOMM Incorporated
Inventor: Mihir JAIN, Zhenyang LI, Efstratios GAVVES, Cornelis Gerardus Maria SNOEK
CPC classification number: G06T7/0087, G06K9/00718, G06K9/3216, G06K9/3241, G06K9/40, G06K9/4671, G06K9/628, G06K2009/00738, G06N3/0445, G06N3/0454, G06T7/143, G06T2207/10016, G06T2210/12
Abstract: A method generates bounding-boxes within frames of a sequence of frames. The bounding-boxes may be generated via a recurrent neural network (RNN) such as a long short-term memory (LSTM) network. The method includes receiving the sequence of frames and generating an attention feature map for each frame of the sequence of frames. Each attention feature map indicates at least one potential moving object. The method also includes up-sampling each attention feature map to determine an attention saliency for pixels in each frame of the sequence of frames. The method further includes generating a bounding-box within each frame based on the attention saliency and temporally smoothing multiple bounding-boxes along the sequence of frames to obtain a smooth sequence of bounding-boxes. The method still further includes localizing an action location within each frame based on the smooth sequence of bounding-boxes.
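Illustrative sketch (not taken from the patent text): the post-processing steps named in the abstract above (up-sampling an attention map, deriving a per-frame bounding-box from the saliency, and temporally smoothing the boxes) might look roughly like the following in Python with NumPy. The nearest-neighbour up-sampling, the relative threshold `rel_thresh`, the moving-average smoother, and all function names are assumptions made for illustration; the abstract does not specify any of them, and the attention maps are assumed to be non-negative arrays produced elsewhere (for example by an LSTM).

```python
import numpy as np

def upsample_nearest(att, factor):
    """Up-sample a 2-D attention map by repeating each cell `factor` times."""
    return np.repeat(np.repeat(att, factor, axis=0), factor, axis=1)

def box_from_saliency(sal, rel_thresh=0.5):
    """Tightest box [x_min, y_min, x_max, y_max] around salient pixels.

    A pixel counts as salient when its value is at least rel_thresh * max
    (an assumed rule, chosen only for this sketch).
    """
    ys, xs = np.nonzero(sal >= rel_thresh * sal.max())
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()], dtype=float)

def smooth_boxes(boxes, window=3):
    """Centered moving average over time; `window` is assumed to be odd."""
    boxes = np.asarray(boxes, dtype=float)
    pad = window // 2
    # Repeat the first and last box so the output has one box per frame.
    padded = np.pad(boxes, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    cols = [np.convolve(padded[:, k], kernel, mode="valid")
            for k in range(boxes.shape[1])]
    return np.stack(cols, axis=1)

def localize(att_maps, factor, rel_thresh=0.5, window=3):
    """Per-frame boxes from attention maps, then temporal smoothing."""
    raw = [box_from_saliency(upsample_nearest(a, factor), rel_thresh)
           for a in att_maps]
    return smooth_boxes(raw, window)
```

Calling `localize(att_maps, factor=8)` on a list of equally sized attention maps returns an array with one smoothed box per frame; the smoothed sequence is what the abstract refers to as the basis for localizing the action in each frame.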
-
Publication No.: US20170262995A1
Publication date: 2017-09-14
Application No.: US15249280
Filing date: 2016-08-26
Applicant: QUALCOMM Incorporated
Inventor: Zhenyang LI, Efstratios GAVVES, Mihir JAIN, Cornelis Gerardus Maria SNOEK
CPC classification number: G06T7/11, G06K9/00335, G06K9/00718, G06N3/0445, G06N3/0454, G06N3/08, G06T7/0081, G06T2207/10004, G06T2207/20084
Abstract: A method of processing data within a convolutional attention recurrent neural network (RNN) includes generating a current multi-dimensional attention map. The current multi-dimensional attention map indicates areas of interest in a first frame from a sequence of spatio-temporal data. The method further includes receiving a multi-dimensional feature map. The method also includes convolving the current multi-dimensional attention map and the multi-dimensional feature map to obtain a multi-dimensional hidden state and a next multi-dimensional attention map. The method identifies a class of interest in the first frame based on the multi-dimensional hidden state and training data.
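Illustrative sketch (not taken from the patent text): one recurrent step in which a spatial attention map and a feature map are combined by convolution to give a spatial hidden state and the next attention map could be written as below in Python with NumPy. This is a plain convolutional recurrent update, swapped in for simplicity, not the claimed network; the single-channel kernels `w_x`, `w_h`, `w_a`, the `tanh` update, the channel sum, and the spatial softmax are all assumptions for illustration, and the classification step mentioned in the abstract is omitted.

```python
import numpy as np

def conv2d_same(x, k):
    """Zero-padded 2-D cross-correlation; x is (H, W), k is odd-sized."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def spatial_softmax(z):
    """Normalize a 2-D map so its entries are non-negative and sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_step(att, feat, hidden, w_x, w_h, w_a):
    """One hypothetical recurrent step.

    att:    (H, W) current attention map
    feat:   (C, H, W) feature map for the current frame
    hidden: (H, W) previous hidden state (kept spatial, not flattened)
    Returns the new hidden state and the next attention map, both (H, W).
    """
    # Weight every feature channel by the attention map, then sum channels.
    attended = (feat * att[None, :, :]).sum(axis=0)
    new_hidden = np.tanh(conv2d_same(attended, w_x) + conv2d_same(hidden, w_h))
    next_att = spatial_softmax(conv2d_same(new_hidden, w_a))
    return new_hidden, next_att
```

The point of the sketch is only that the hidden state and the attention map keep their spatial layout from step to step, which is what distinguishes a convolutional attention recurrence from one that flattens its inputs into vectors.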
-