Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation
This work addresses pixel-level localization in weakly supervised semantic segmentation, which matters for image analysis when only limited annotations are available. It proposes a new approach rather than an incremental refinement of existing ones.
In weakly supervised semantic segmentation, classifiers often focus on small discriminative regions of the target object. The paper interprets this as an information bottleneck caused by the final activation function (sigmoid or softmax). The proposed method removes that function and introduces a new pooling technique, improving localization maps and reaching state-of-the-art performance on the PASCAL VOC 2012 and MS COCO 2014 datasets.
Weakly supervised semantic segmentation produces pixel-level localization from class labels; however, a classifier trained on such labels is likely to focus on a small discriminative region of the target object. We interpret this phenomenon using the information bottleneck principle: the final layer of a deep neural network, activated by the sigmoid or softmax activation functions, causes an information bottleneck, and as a result, only a subset of the task-relevant information is passed on to the output. We first support this argument through a simulated toy experiment and then propose a method to reduce the information bottleneck by removing the last activation function. In addition, we introduce a new pooling method that further encourages the transmission of information from non-discriminative regions to the classification. Our experimental evaluations demonstrate that this simple modification significantly improves the quality of localization maps on both the PASCAL VOC 2012 and MS COCO 2014 datasets, exhibiting a new state-of-the-art performance for weakly supervised semantic segmentation. The code is available at: https://github.com/jbeomlee93/RIB.
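One way to build intuition for the bottleneck argument is to look at how the sigmoid saturates. The sketch below is an illustration under stated assumptions, not the authors' implementation: it compares the gradient with respect to a logit z for a positive label under binary cross-entropy after a sigmoid, against a hypothetical loss applied directly to the raw logit (here simply -z, chosen only for illustration). The function names are made up for this example.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def grad_bce_wrt_logit(z: float) -> float:
    """Gradient of -log(sigmoid(z)) with respect to z (positive label).

    Analytically this equals sigmoid(z) - 1, which approaches 0 as z grows.
    """
    return sigmoid(z) - 1.0


def grad_linear_wrt_logit(z: float) -> float:
    """Gradient of the illustrative loss -z with respect to z.

    Assumption: this loss is a stand-in for "no final activation";
    it is not claimed to be the loss used in the paper.
    """
    return -1.0


if __name__ == "__main__":
    for z in (0.0, 2.0, 5.0, 10.0):
        print(
            f"z={z:5.1f}  with sigmoid: {grad_bce_wrt_logit(z):+.5f}  "
            f"without: {grad_linear_wrt_logit(z):+.5f}"
        )
```

Once the classifier is already confident (large z), the gradient through the sigmoid is close to zero, so there is little training signal left to pull in evidence from less discriminative regions. Without the activation, the gradient in this toy setting does not shrink as z grows.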