ISIM: Iterative Self-Improved Model for Weakly Supervised Segmentation
This work addresses incomplete segmentation in weakly supervised learning. It is aimed at computer vision researchers and represents an incremental improvement.
The paper tackles weakly supervised semantic segmentation by proposing an iterative self-improved model that generates more coherent class activation maps and pseudo-segmentation labels, improving the state of the art on the Pascal VOC12 dataset by 2.5%.
Weakly Supervised Semantic Segmentation (WSSS) is a challenging task that aims to learn segmentation labels from class-level labels. In the literature, WSSS studies widely exploit the information obtained from Class Activation Maps (CAMs). However, because CAMs are obtained from a classification network, they focus on the most discriminative parts of objects and therefore provide incomplete prior information for segmentation tasks. In this study, to obtain CAMs that are more coherent with segmentation labels, we propose a framework that employs an iterative approach in a modified encoder-decoder-based segmentation model, which simultaneously supports classification and segmentation tasks. As no ground-truth segmentation labels are given, the same model also generates the pseudo-segmentation labels with the help of dense Conditional Random Fields (dCRF). As a result, the proposed framework becomes an iterative self-improved model. Experiments with DeepLabv3 and UNet models show a significant gain on the Pascal VOC12 dataset, and the DeepLabv3 application improves the current state-of-the-art metric by 2.5%. The implementation associated with the experiments can be found at https://github.com/cenkbircanoglu/isim.
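To make the pseudo-label step described above more concrete, the following is a minimal sketch of how CAMs might be turned into a pseudo-segmentation map. It is not taken from the linked repository: the function name `cams_to_pseudo_labels`, the constant background score `bg_threshold=0.3`, and the per-class max normalization are all assumptions chosen for illustration, and the dCRF refinement and the training loop are omitted.

```python
import numpy as np


def cams_to_pseudo_labels(cams, image_labels, bg_threshold=0.3):
    """Convert class activation maps into a pseudo-segmentation map.

    cams:         array of shape (C, H, W) with raw activations per class.
    image_labels: binary vector of shape (C,), 1 if the class is in the image.
    bg_threshold: hypothetical constant score assigned to the background.

    Returns an (H, W) integer map where 0 is background and k + 1 is class k.
    """
    cams = np.maximum(np.asarray(cams, dtype=float), 0.0)  # keep positive evidence
    labels = np.asarray(image_labels, dtype=float)

    # Suppress classes that the image-level labels say are absent.
    cams = cams * labels[:, None, None]

    # Normalize each class map to [0, 1] by its own peak activation.
    peak = cams.max(axis=(1, 2), keepdims=True)
    cams = cams / np.maximum(peak, 1e-8)

    # Prepend a constant background score and pick the best class per pixel.
    background = np.full((1,) + cams.shape[1:], bg_threshold)
    scores = np.concatenate([background, cams], axis=0)
    return scores.argmax(axis=0)


# Toy example: two classes on a 2x2 image, only class 0 is present.
toy_cams = np.array([[[4.0, 1.0], [0.0, 0.0]],
                     [[9.0, 9.0], [9.0, 9.0]]])
toy_labels = np.array([1, 0])
pseudo = cams_to_pseudo_labels(toy_cams, toy_labels)
```

In an iterative scheme of the kind the abstract describes, a map like `pseudo` would be refined (for example with dCRF) and then used as the segmentation target for the next training round, after which new CAMs are extracted from the updated model.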