Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network
This paper addresses the problem of reducing annotation costs for semantic segmentation in computer vision, though the contribution is incremental in that it builds on existing weakly-supervised methods.
The paper tackles weakly-supervised semantic segmentation by transferring segmentation knowledge from auxiliary categories with full annotations to images with only image-level labels, achieving substantially improved performance on the PASCAL VOC 2012 dataset using annotations from 60 exclusive categories in Microsoft COCO.
We propose a novel weakly-supervised semantic segmentation algorithm based on a Deep Convolutional Neural Network (DCNN). In contrast to existing weakly-supervised approaches, our algorithm exploits auxiliary segmentation annotations available for different categories to guide segmentation on images with only image-level class labels. To make the segmentation knowledge transferrable across categories, we design a decoupled encoder-decoder architecture with an attention model. In this architecture, the attention model generates spatial highlights of each category present in an image, and the decoder subsequently generates a foreground segmentation for each highlighted region. We show that, combined with the attention model, a decoder trained with segmentation annotations from different categories boosts the performance of weakly-supervised semantic segmentation. The proposed algorithm demonstrates substantially improved performance compared to state-of-the-art weakly-supervised techniques on the challenging PASCAL VOC 2012 dataset when our model is trained with annotations from 60 exclusive categories in the Microsoft COCO dataset.
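The decoupled pipeline described above (per-category attention followed by a shared foreground decoder) can be illustrated with a minimal NumPy sketch. This is not the authors' network: the dot-product attention, the single linear decoder, the 0.5 threshold, and all names (`attention_highlight`, `decode_foreground`, `segment`, `class_embeddings`, `decoder_weights`) are assumptions made for illustration. The only point it tries to show is that the decoder takes no category input, which is what would let it be trained on auxiliary categories and reused on new ones.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_highlight(features, class_embedding):
    """Spatial attention for one category (hypothetical dot-product form).

    features: (H, W, D) encoder output; class_embedding: (D,).
    Returns an (H, W) map that sums to 1.
    """
    h, w, d = features.shape
    scores = features.reshape(h * w, d) @ class_embedding
    return softmax(scores).reshape(h, w)


def decode_foreground(attention, features, decoder_weights):
    """Category-agnostic stand-in decoder (assumed to be a single linear layer).

    It sees only attention-weighted features, never the class label,
    and returns an (H, W) map of foreground probabilities.
    """
    weighted = features * (attention / attention.max())[..., None]
    logits = weighted @ decoder_weights
    return 1.0 / (1.0 + np.exp(-logits))


def segment(features, class_embeddings, present_labels, decoder_weights,
            threshold=0.5):
    """Run attention + decoder once per image-level label and merge the masks.

    Pixels whose best foreground probability is below `threshold`
    are marked -1 (background).
    """
    h, w, _ = features.shape
    fg = np.zeros((len(present_labels), h, w))
    for i, label in enumerate(present_labels):
        att = attention_highlight(features, class_embeddings[label])
        fg[i] = decode_foreground(att, features, decoder_weights)
    label_map = np.asarray(present_labels)[fg.argmax(axis=0)]
    label_map[fg.max(axis=0) < threshold] = -1
    return label_map
```

With random features of shape `(4, 5, 8)` and image-level labels `[0, 2]`, `segment` returns a `(4, 5)` integer map whose entries are drawn from `{-1, 0, 2}`. In a real system the encoder, attention, and decoder would be learned convolutional modules rather than fixed linear maps.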