CV | Apr 4, 2021

Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images

arXiv:2104.01526v1 | 43 citations | Has Code
Originality: Highly original
AI Analysis

This paper addresses the cost of annotating data for instance segmentation in computer vision: it trains from bounding boxes plus a small set of finely annotated salient images instead of per-instance pixel masks.

The paper tackles weakly-supervised instance segmentation by jointly training a class-agnostic segmentation model on box-supervised images and salient images. The masks this model predicts are refined and then used as proxy ground truth to train a Mask R-CNN. Using only 7,991 salient images, the approach is on par with fully-supervised Mask R-CNN on PASCAL VOC and significantly outperforms previous state-of-the-art box-supervised methods on COCO.

Humans have a strong class-agnostic object segmentation ability and can outline the boundaries of unknown objects precisely, which motivates us to propose a box-supervised class-agnostic object segmentation (BoxCaseg) based solution for weakly-supervised instance segmentation. The BoxCaseg model is jointly trained using box-supervised images and salient images in a multi-task learning manner. The finely annotated salient images provide class-agnostic and precise object localization guidance for box-supervised images. The object masks predicted by a pretrained BoxCaseg model are refined via a novel merged and dropped strategy and used as proxy ground truth to train a Mask R-CNN for weakly-supervised instance segmentation. Using only 7,991 salient images, the weakly-supervised Mask R-CNN is on par with fully-supervised Mask R-CNN on PASCAL VOC and significantly outperforms previous state-of-the-art box-supervised instance segmentation methods on COCO. The source code, pretrained models and datasets are available at https://github.com/hustvl/BoxCaseg.
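The abstract names a "merged and dropped" refinement step but does not spell out its rules. The sketch below is only one plausible reading, not the authors' implementation: it clips a predicted mask to its box, merges connected components that are large relative to the biggest one, and drops the instance when the result covers too little of the box. The function names and the `merge_ratio` and `drop_ratio` thresholds are hypothetical and chosen purely for illustration.

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected foreground components as lists of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def refine_proxy_mask(mask, box, merge_ratio=0.2, drop_ratio=0.1):
    """Hypothetical refinement of one predicted mask into a proxy ground truth.

    mask: 2D list of 0/1 values; box: (x0, y0, x1, y1) with x1, y1 exclusive.
    Returns the refined 2D mask, or None if the instance is dropped.
    """
    x0, y0, x1, y1 = box
    h, w = len(mask), len(mask[0])
    # Discard any prediction that falls outside the annotated box.
    clipped = [
        [1 if mask[r][c] and y0 <= r < y1 and x0 <= c < x1 else 0 for c in range(w)]
        for r in range(h)
    ]
    components = sorted(connected_components(clipped), key=len, reverse=True)
    if not components:
        return None  # nothing predicted inside the box: drop

    largest = len(components[0])
    refined = [[0] * w for _ in range(h)]
    area = 0
    for pixels in components:
        # Merge: keep components that are big enough relative to the largest.
        if len(pixels) >= merge_ratio * largest:
            for y, x in pixels:
                refined[y][x] = 1
            area += len(pixels)

    # Drop: a mask covering too little of its box is treated as unreliable.
    if area < drop_ratio * (x1 - x0) * (y1 - y0):
        return None
    return refined
```

In a pipeline like the one described, the masks that survive such a step would serve as training targets for Mask R-CNN, and dropped instances would simply contribute no mask supervision. The actual criteria are defined in the paper and the linked repository.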

