CV · May 19, 2021

Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation

arXiv:2105.08965v1 · 268 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving segmentation accuracy under weak supervision, a topic of interest to computer vision researchers, and represents a strong incremental advance in the field.

The paper tackles limitations in weakly-supervised semantic segmentation, such as sparse object coverage and inaccurate boundaries, by proposing a framework that combines image-level labels and saliency maps to generate high-quality pseudo-masks, achieving state-of-the-art performance on PASCAL VOC 2012 and MS COCO 2014 datasets.

Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. To overcome these challenges, we propose a novel framework, namely Explicit Pseudo-pixel Supervision (EPS), which learns from pixel-level feedback by combining two weak supervisions; the image-level label provides the object identity via the localization map and the saliency map from the off-the-shelf saliency detection model offers rich boundaries. We devise a joint training strategy to fully utilize the complementary relationship between both information. Our method can obtain accurate object boundaries and discard co-occurring pixels, thereby significantly improving the quality of pseudo-masks. Experimental results show that the proposed method remarkably outperforms existing methods by resolving key challenges of WSSS and achieves the new state-of-the-art performance on both PASCAL VOC 2012 and MS COCO 2014 datasets.
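The abstract's central idea, using a saliency map as pixel-level feedback on the localization maps, can be illustrated with a small sketch. This is not the authors' implementation: the blending formula, the weight `lam`, the use of a mean squared error, and the function names `estimated_saliency` and `saliency_loss` are all assumptions made for illustration. The sketch assumes each pixel has per-class foreground scores plus a background score, all in [0, 1], and compares a saliency map estimated from them against the one produced by an off-the-shelf detector.

```python
from typing import List

Map = List[List[float]]  # H x W grid of scores in [0, 1]


def estimated_saliency(fg_maps: List[Map], bg_map: Map, lam: float = 0.5) -> Map:
    """Estimate a saliency map from localization maps.

    Assumed form (illustrative, not taken from the source text):
    lam * (sum of foreground maps) + (1 - lam) * (1 - background map).
    """
    height, width = len(bg_map), len(bg_map[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            fg = sum(m[y][x] for m in fg_maps)
            row.append(lam * fg + (1.0 - lam) * (1.0 - bg_map[y][x]))
        out.append(row)
    return out


def saliency_loss(estimated: Map, target: Map) -> float:
    """Mean squared error between the estimated and the given saliency map."""
    diffs = [
        (e - t) ** 2
        for est_row, tgt_row in zip(estimated, target)
        for e, t in zip(est_row, tgt_row)
    ]
    return sum(diffs) / len(diffs)


# Toy 1 x 2 image: pixel 0 is the train (salient), pixel 1 is the railroad.
target = [[1.0, 0.0]]

# A "train" map that also fires on the co-occurring railroad pixel.
loss_leaky = saliency_loss(estimated_saliency([[[0.9, 0.8]]], [[0.1, 0.2]]), target)

# A "train" map that assigns the railroad pixel to the background.
loss_clean = saliency_loss(estimated_saliency([[[0.9, 0.1]]], [[0.1, 0.9]]), target)
```

In this toy case the map that leaks onto the railroad incurs the larger loss, which is the kind of signal that would push a network to discard co-occurring pixels. A real training setup would combine such a term with a classification loss on the image-level labels.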

Code Implementations: 1 repo
