CV May 10, 2023

A Self-Training Framework Based on Multi-Scale Attention Fusion for Weakly Supervised Semantic Segmentation

arXiv:2305.05841v1 1.5 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of incomplete semantic region extraction in weakly supervised semantic segmentation for computer vision applications, representing an incremental improvement.

The paper tackles weakly supervised semantic segmentation with image-level labels by proposing a self-training method that uses fused multi-scale class-aware attention maps, achieving 72.4% mIoU on both the PASCAL VOC 2012 validation and test sets.

Weakly supervised semantic segmentation (WSSS) based on image-level labels is challenging since it is hard to obtain complete semantic regions. To address this issue, we propose a self-training method that utilizes fused multi-scale class-aware attention maps. Our observation is that attention maps of different scales contain rich complementary information, especially for large and small objects. Therefore, we collect information from attention maps of different scales and obtain multi-scale attention maps. We then apply denoising and reactivation strategies to enhance the potential regions and reduce noisy areas. Finally, we use the refined attention maps to retrain the network. Experiments show that our method enables the model to extract rich semantic information from multi-scale images and achieves 72.4% mIoU on both the PASCAL VOC 2012 validation and test sets. The code is available at https://bupt-ai-cz.github.io/SMAF.
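The fusion, denoising, and reactivation steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`resize_nearest`, `fuse_multiscale`, `denoise_and_reactivate`), the use of a plain average as the fusion rule, the nearest-neighbour resizing, the `noise_thresh` value, and the per-class peak rescaling are all assumptions chosen only to make the idea concrete.

```python
import numpy as np


def resize_nearest(maps, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) array to (C, out_h, out_w)."""
    _, h, w = maps.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return maps[:, rows[:, None], cols[None, :]]


def fuse_multiscale(maps_per_scale, out_h, out_w):
    """Bring class-aware attention maps from every scale to one size and average them.

    Averaging is an assumed fusion rule; the paper may combine scales differently.
    """
    resized = [resize_nearest(m, out_h, out_w) for m in maps_per_scale]
    return np.mean(resized, axis=0)


def denoise_and_reactivate(fused, noise_thresh=0.2, eps=1e-8):
    """Zero out weak activations, then rescale each class map by its peak.

    noise_thresh is a hypothetical value, not one taken from the paper.
    """
    cleaned = np.where(fused < noise_thresh, 0.0, fused)  # denoising (assumed form)
    peak = cleaned.max(axis=(1, 2), keepdims=True)        # per-class maximum
    return cleaned / (peak + eps)                         # reactivation (assumed form)


# Toy usage: one class, attention maps computed at two hypothetical scales.
small = np.array([[[0.0, 1.0], [0.0, 0.0]]])  # shape (1, 2, 2)
large = np.zeros((1, 4, 4))
large[0, 0, 3] = 1.0                          # shape (1, 4, 4)

fused = fuse_multiscale([small, large], out_h=4, out_w=4)
refined = denoise_and_reactivate(fused)
```

In a self-training loop of the kind the abstract describes, maps like `refined` would then serve as supervision when the network is retrained; that step depends on the model and loss and is left out here.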
