CV · LG · May 13, 2021

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

arXiv:2105.06464v2 · 96 citations
Originality: Incremental advance
AI Analysis

This work addresses the cost of annotating data for computer vision tasks. It offers a weakly supervised solution that delivers an incremental performance improvement over existing methods.

The paper tackles the problem of learning instance segmentation and semantic correspondence using only bounding box supervision, achieving 37.9% AP on COCO instance segmentation, which surpasses prior weakly supervised methods and is competitive with supervised approaches.

We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision. Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. The teacher is a structured energy model incorporating a pairwise potential and a cross-image potential to model the pairwise pixel relationships both within and across the boxes. Minimizing the teacher energy simultaneously yields refined object masks and dense correspondences between intra-class objects, which are taken as pseudo-labels to supervise the task network and provide positive/negative correspondence pairs for dense contrastive learning. We show a symbiotic relationship where the two tasks mutually benefit from each other. Our best model achieves 37.9% AP on COCO instance segmentation, surpassing prior weakly supervised methods and remaining competitive with supervised methods. We also obtain state-of-the-art weakly supervised results on PASCAL VOC12 and PF-PASCAL with real-time inference.
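
To illustrate how positive/negative correspondence pairs can drive contrastive learning, here is a minimal sketch of a generic InfoNCE-style loss for a single anchor pixel feature. It is not taken from the paper: the function names `cosine` and `info_nce`, the cosine similarity measure, and the `temperature` value are all assumptions chosen for illustration, and the paper's actual loss may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper).

    anchor:    feature of a pixel in one object
    positive:  feature of its matched pixel in another object of the same class
    negatives: features of pixels that should not match the anchor
    """
    pos_logit = cosine(anchor, positive) / temperature
    logits = [pos_logit] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denom - pos_logit


# A well-matched positive gives a lower loss than a mismatched one.
good = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
bad = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(good < bad)
```

In a setup like the one the abstract describes, the positives and negatives would come from the teacher's dense correspondences rather than from hand labels. This sketch takes them as given inputs.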

Code Implementations: 3 repos