CV · Apr 15, 2021

Learning structure-aware semantic segmentation with image-level supervision

arXiv:2104.07216v1 · 8 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of producing accurate semantic segmentation from cheaper image-level supervision. Its contribution is incremental: it refines existing CAM-based methods.

The paper tackles two problems in weakly-supervised semantic segmentation with image-level labels, namely lost structure information and inconsistent class activation scores, and reports improved predictions on the PASCAL-VOC dataset.

Compared with expensive pixel-wise annotations, image-level labels make it possible to learn semantic segmentation in a weakly-supervised manner. Within this pipeline, the class activation map (CAM) is obtained and further processed to serve as a pseudo label to train the semantic segmentation model in a fully-supervised manner. In this paper, we argue that the lost structure information in CAM limits its application in downstream semantic segmentation, leading to deteriorated predictions. Furthermore, the inconsistent class activation scores inside the same object contradict the common sense that each region of the same object should belong to the same semantic category. To produce sharp predictions with structure information, we introduce an auxiliary semantic boundary detection module, which penalizes the deteriorated predictions. Furthermore, we adopt a smoothness loss to encourage predictions inside the object to be consistent. Experimental results on the PASCAL-VOC dataset illustrate the effectiveness of the proposed solution.
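
The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the standard CAM formulation (a classifier-weighted sum of the last convolutional feature maps), a fixed foreground threshold for the pseudo label, and a generic neighbour-difference penalty standing in for the smoothness loss. The function names, the threshold of 0.5, and the exact form of the penalty are illustrative assumptions, not the paper's implementation, and the auxiliary boundary detection module is not covered. Plain Python lists are used so the example has no dependencies.

```python
def class_activation_map(features, class_weights):
    """Weighted sum of feature channels for one class, clipped and scaled to [0, 1].

    features: C x H x W nested lists (last conv-layer activations).
    class_weights: C classifier weights for the class of interest.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for channel, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    # Keep positive evidence only, then normalise by the peak score.
    cam = [[max(v, 0.0) for v in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


def pseudo_label(cam, threshold=0.5):
    """Binary foreground mask; the threshold value is an assumption."""
    return [[1 if v >= threshold else 0 for v in row] for row in cam]


def smoothness_penalty(score_map):
    """Mean absolute difference between horizontally and vertically adjacent scores.

    A generic stand-in for a smoothness loss, not the paper's definition.
    """
    height, width = len(score_map), len(score_map[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += abs(score_map[y][x] - score_map[y][x + 1])
                count += 1
            if y + 1 < height:
                total += abs(score_map[y][x] - score_map[y + 1][x])
                count += 1
    return total / count if count else 0.0


# Toy input: two 2x2 feature channels and one class's weights.
features = [
    [[1.0, 2.0], [0.0, 0.0]],
    [[0.0, 0.0], [4.0, 4.0]],
]
cam = class_activation_map(features, [1.0, -1.0])
mask = pseudo_label(cam)
penalty = smoothness_penalty(cam)
```

In this toy case the top-left pixel scores half the peak even though it ends up in the same foreground mask as its neighbour. That is the kind of within-object inconsistency a smoothness term is meant to discourage, since the penalty is zero only when adjacent scores agree.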

Code Implementations: 1 repo
