CV · Jun 6, 2015

What's the Point: Semantic Segmentation with Point Supervision

arXiv:1506.02106v5 · 1090 citations
Originality: Incremental advance
AI Analysis

This paper addresses the high annotation cost of training accurate segmentation models, offering computer vision researchers and practitioners a cheaper form of supervision.

The paper tackles the trade-off between annotation cost and accuracy in semantic segmentation by using point-level supervision instead of per-pixel or image-level labels, achieving a 12.9% mIOU improvement over image-level supervision on PASCAL VOC 2012.

The semantic image segmentation task presents a trade-off between test-time accuracy and training-time annotation cost. Detailed per-pixel annotations enable training accurate models but are very time-consuming to obtain; image-level class labels are an order of magnitude cheaper but result in less accurate models. We take a natural step from image-level annotation towards stronger supervision: we ask annotators to point to an object if one exists. We incorporate this point supervision along with a novel objectness potential in the training loss function of a CNN model. Experimental results on the PASCAL VOC 2012 benchmark reveal that the combined effect of point-level supervision and objectness potential yields an improvement of 12.9% mIOU over image-level supervision. Further, we demonstrate that models trained with point-level supervision are more accurate than models trained with image-level, squiggle-level or full supervision given a fixed annotation budget.
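
To make the idea of combining point supervision with an objectness potential in the loss more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, argument layout, and unweighted sum of the two terms are hypothetical, and the image-level term the abstract mentions is omitted for brevity. The point term is a cross-entropy evaluated only at the annotated pixels; the objectness term rewards predictions whose total object-class probability agrees with a per-pixel objectness prior.

```python
import numpy as np

def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def point_supervised_loss(scores, points, objectness, object_classes, eps=1e-12):
    """Hypothetical loss: point cross-entropy plus an objectness term.

    scores:         (H, W, C) raw per-pixel class scores from a segmentation model
    points:         list of (row, col, class_id) annotated points
    objectness:     (H, W) prior probability that each pixel lies on some object
    object_classes: length-C boolean mask, True for object classes,
                    False for background classes (assumed split)
    """
    probs = softmax(scores, axis=-1)
    mask = np.asarray(object_classes, dtype=bool)

    # Point term: cross-entropy only at the pixels an annotator clicked on.
    point_term = 0.0
    for row, col, class_id in points:
        point_term -= np.log(probs[row, col, class_id] + eps)

    # Objectness term: probability that the prediction and the prior agree
    # on "object vs. background" at each pixel, averaged over the image.
    p_object = probs[..., mask].sum(axis=-1)
    p_background = probs[..., ~mask].sum(axis=-1)
    agreement = objectness * p_object + (1.0 - objectness) * p_background
    objectness_term = -np.mean(np.log(agreement + eps))

    return point_term + objectness_term

# Tiny example: a 2x2 image, 3 classes (class 0 is background), one clicked point.
scores = np.zeros((2, 2, 3))
points = [(0, 0, 1)]
objectness = np.full((2, 2), 0.5)
object_classes = [False, True, True]
loss = point_supervised_loss(scores, points, objectness, object_classes)
```

With uniform scores, every class probability is 1/3, so the point term is log 3; an objectness prior of 0.5 makes the agreement 0.5 at every pixel, so the objectness term is log 2. Raising the score of the annotated class at the clicked pixel lowers the loss.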

Code Implementations: 1 repo
