Fusing multilayer and multimodal deep neural networks for video classification
Abstract:
A method, computer-readable medium, and system are disclosed for classifying video image data. The method includes processing training video image data by at least a first layer of a convolutional neural network (CNN) to extract a first set of feature maps and to generate classification output data for the training video image data. Spatial classification accuracy data is computed based on the classification output data and target classification output data, and spatial discrimination factors for the first layer are computed based on the spatial classification accuracy data and the first set of feature maps.
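The abstract does not give formulas for the accuracy data or the discrimination factors, so the following is only a minimal sketch of one way the described steps could fit together. It assumes that each spatial location of a feature map yields its own class prediction, that the accuracy at a location is the fraction of training samples classified correctly there, and that the factors are a softmax over those accuracies used to weight the feature maps during pooling. The function names, the softmax normalization, and the `temperature` parameter are all hypothetical and are not taken from the source.

```python
import math


def spatial_accuracy(predictions, targets):
    """Fraction of samples classified correctly at each spatial location.

    predictions[n][h][w] is the class predicted for sample n from the
    feature vector at location (h, w); targets[n] is its target class.
    """
    n_samples = len(predictions)
    height = len(predictions[0])
    width = len(predictions[0][0])
    return [
        [
            sum(predictions[n][h][w] == targets[n] for n in range(n_samples))
            / n_samples
            for w in range(width)
        ]
        for h in range(height)
    ]


def discrimination_factors(accuracy, temperature=1.0):
    """Turn per-location accuracies into weights that sum to 1.

    The softmax used here is an assumed normalization, not the source's.
    """
    exps = [[math.exp(a / temperature) for a in row] for row in accuracy]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def weighted_pool(feature_maps, factors):
    """Collapse feature_maps[c][h][w] to one value per channel,
    weighting each location by its discrimination factor."""
    return [
        sum(f * v for frow, vrow in zip(factors, fmap) for f, v in zip(frow, vrow))
        for fmap in feature_maps
    ]


# Toy data: two training samples on a 2x2 spatial grid.
predictions = [[[0, 1], [0, 0]], [[1, 1], [0, 1]]]
targets = [0, 1]

accuracy = spatial_accuracy(predictions, targets)
factors = discrimination_factors(accuracy)
pooled = weighted_pool([[[1.0, 2.0], [3.0, 4.0]]], factors)
```

In this toy case the two diagonal locations are correct for both samples, so they receive larger factors than the off-diagonal ones and contribute more to the pooled descriptor. Any other monotonic mapping from accuracy to weight would serve the same illustrative purpose.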