CV, LG, IV | Feb 16, 2020

Block Annotation: Better Image Annotation for Semantic Segmentation with Sub-Image Decomposition

arXiv:2002.06626v1 | 23 citations
Originality: Incremental advance
AI Analysis

The paper addresses the expense and time cost of image annotation faced by computer vision researchers and practitioners. It offers a more efficient alternative and is an incremental improvement over existing annotation methods.

The paper tackles the high cost of full-image pixel-level annotation for semantic segmentation by proposing block sub-image annotation. The approach reduces annotation time and cost: segmentation performance matches dense annotation with only 50% of pixels annotated, and reaches up to 98% of dense-annotation performance with as little as 12% of pixels annotated.

Image datasets with high-quality pixel-level annotations are valuable for semantic segmentation: labelling every pixel in an image ensures that rare classes and small objects are annotated. However, full-image annotations are expensive, with experts spending up to 90 minutes per image. We propose block sub-image annotation as a replacement for full-image annotation. Despite the attention cost of frequent task switching, we find that block annotations can be crowdsourced at higher quality compared to full-image annotation with equal monetary cost using existing annotation tools developed for full-image annotation. Surprisingly, we find that 50% pixels annotated with blocks allows semantic segmentation to achieve equivalent performance to 100% pixels annotated. Furthermore, as little as 12% of pixels annotated allows performance as high as 98% of the performance with dense annotation. In weakly-supervised settings, block annotation outperforms existing methods by 3-4% (absolute) given equivalent annotation time. To recover the necessary global structure for applications such as characterizing spatial context and affordance relationships, we propose an effective method to inpaint block-annotated images with high-quality labels without additional human effort. As such, fewer annotations can also be used for these applications compared to full-image annotation.
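To make the idea of annotating only a fraction of the pixels concrete, here is a minimal sketch of how a block-annotated label map could be simulated from a dense one. It is not the authors' code. The function names (`block_mask`, `apply_blocks`), the uniform grid of square blocks, the random choice of blocks, and the `ignore_index` value of 255 are all assumptions made for illustration; the abstract does not say how blocks are sized or selected.

```python
import numpy as np

def block_mask(height, width, block_size, fraction, seed=0):
    """Boolean mask that is True inside the selected blocks.

    Assumption: the image is tiled by a uniform grid of square blocks and
    a random subset of them is chosen. The selection pattern is hypothetical.
    """
    rng = np.random.default_rng(seed)
    rows = -(-height // block_size)  # ceiling division
    cols = -(-width // block_size)
    n_blocks = rows * cols
    n_keep = max(1, round(fraction * n_blocks))
    keep = rng.choice(n_blocks, size=n_keep, replace=False)

    grid = np.zeros(n_blocks, dtype=bool)
    grid[keep] = True
    grid = grid.reshape(rows, cols)

    # Expand each grid cell to block_size x block_size pixels, then crop
    # to the image size in case the blocks overhang the border.
    mask = np.repeat(np.repeat(grid, block_size, axis=0), block_size, axis=1)
    return mask[:height, :width]

def apply_blocks(labels, mask, ignore_index=255):
    """Keep labels inside the blocks; mark every other pixel as unlabelled."""
    sparse = np.full_like(labels, ignore_index)
    sparse[mask] = labels[mask]
    return sparse

if __name__ == "__main__":
    dense = np.random.default_rng(1).integers(0, 20, size=(480, 640), dtype=np.uint8)
    for fraction in (0.5, 0.125):
        mask = block_mask(480, 640, block_size=80, fraction=fraction)
        sparse = apply_blocks(dense, mask)
        print(f"requested {fraction:.3f}, annotated {mask.mean():.3f}")
```

A segmentation loss would then skip the pixels marked with `ignore_index`, so only the block-annotated pixels contribute to training.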

