DEEP 3D ATTENTION LONG SHORT-TERM MEMORY FOR VIDEO-BASED ACTION RECOGNITION
Abstract:
A method, a computer program product, and a system are provided for video-based action recognition. The system includes a processor. One or more frames from one or more video sequences are received. A feature vector for each patch of the one or more frames is generated using a deep convolutional neural network. An attention factor for the feature vectors is generated based on a within-frame attention and a between-frame attention. A target action is identified using a multi-layer deep long short-term memory process applied to the attention factor, said target action representing at least one of the one or more video sequences. An operation of a processor-based machine is controlled to change a state of the processor-based machine, responsive to the at least one of the one or more video sequences including the identified target action.
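The abstract does not say how the within-frame and between-frame attentions are combined into the attention factor. The following is a minimal sketch of one plausible combination, not the claimed method: patches are scored and softmax-normalized inside each frame, each frame is summarized by its attention-weighted patch average, frames are then scored and normalized across the sequence, and the two weights are multiplied. The function name `attention_factor`, the scoring vectors `w_patch` and `w_frame`, and the random stand-in features are all hypothetical. The convolutional feature extractor and the multi-layer long short-term memory stage are omitted, and plain Python is used so that the sketch is self-contained.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_factor(features, w_patch, w_frame):
    """features[t][p] is the feature vector of patch p in frame t.

    w_patch and w_frame are hypothetical scoring vectors (assumptions,
    not taken from the source). Returns (factor, frame_summaries).
    """
    # Within-frame attention: normalize patch scores inside each frame.
    within = [
        softmax([dot(patch, w_patch) for patch in frame]) for frame in features
    ]

    # Summarize each frame as the attention-weighted average of its patches.
    dim = len(w_patch)
    frame_summaries = [
        [sum(a * patch[d] for a, patch in zip(weights, frame)) for d in range(dim)]
        for weights, frame in zip(within, features)
    ]

    # Between-frame attention: normalize frame scores across the sequence.
    between = softmax([dot(s, w_frame) for s in frame_summaries])

    # Assumed combination: product of the two weights for every patch.
    factor = [[b * a for a in weights] for b, weights in zip(between, within)]
    return factor, frame_summaries


# Random stand-in for CNN patch features: 4 frames, 9 patches, 16 dimensions.
random.seed(0)
T, P, D = 4, 9, 16
features = [
    [[random.gauss(0, 1) for _ in range(D)] for _ in range(P)] for _ in range(T)
]
w_patch = [random.gauss(0, 1) for _ in range(D)]
w_frame = [random.gauss(0, 1) for _ in range(D)]

factor, frame_summaries = attention_factor(features, w_patch, w_frame)
```

Under this assumed combination the factor forms a single distribution over every patch of every frame, so it sums to one across the whole sequence. In a full pipeline, the per-frame summaries (or the factor-weighted features) would be passed frame by frame to the stacked long short-term memory layers that identify the target action.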