Image sequence processing using neural networks
Abstract:
A recurrent multi-task CNN with an encoder and multiple decoders infers single value output and dense (image) outputs such as heatmaps and segmentation masks. Recurrence is obtained by reinjecting (with mere concatenation) heatmaps or masks (or intermediate feature maps) to a next input image (or to next intermediate feature maps) for a next CNN inference. The inference outputs may be refined using cascaded refiner blocks specifically trained. Virtual annotation for training video sequences can be obtained using computer analysis. Benefits of these approaches allows the depth of the CNN, i.e. the number of layers, to be reduced. They also avoid parallel independent inferences to be run for different tasks, while keeping similar prediction quality. Multiple task inferences are useful for Augmented Reality applications.
Public/Granted literature
Information query
Patent Agency Ranking
0/0