CV · AI · LG · Oct 17, 2025

Semantic segmentation with coarse annotations

arXiv:2510.15756v1 · h-index: 1
Originality: Incremental advance
AI Analysis

The paper addresses the cost of obtaining fine annotations for semantic segmentation and offers a practical alternative for domains such as medical imaging or autonomous driving. The advance is incremental, since it builds on existing encoder-decoder architectures.

The paper tackles semantic segmentation from coarse annotations, which are cheaper to obtain than fine annotations but lead to poor boundary alignment. When models are trained on coarse annotations, the proposed regularization method significantly improves boundary recall over state-of-the-art models on the SUIM, Cityscapes, and PanNuke datasets.
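Boundary recall, the metric named above, can be made concrete with a small sketch. The definition used here is one common choice and an assumption, not necessarily the one used in the paper: a ground-truth boundary pixel counts as recalled if a predicted boundary pixel lies within a square window of `tolerance` pixels. The function names and the `tolerance` parameter are hypothetical.

```python
import numpy as np


def boundary_map(labels):
    """Mark pixels whose right or bottom neighbour carries a different label."""
    b = np.zeros(labels.shape, dtype=bool)
    b[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    b[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return b


def dilate(mask, radius):
    """Dilate a boolean mask with a (2r+1) x (2r+1) square window."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def boundary_recall(pred, gt, tolerance=1):
    """Fraction of ground-truth boundary pixels near a predicted boundary.

    Assumed definition: "near" means within `tolerance` pixels in the
    Chebyshev (square-window) sense.
    """
    gt_b = boundary_map(gt)
    if not gt_b.any():
        return 1.0  # no boundaries to recall
    pred_b = dilate(boundary_map(pred), tolerance)
    return float((gt_b & pred_b).sum() / gt_b.sum())


# Toy example: a vertical class boundary, predicted one pixel too far right.
gt = np.zeros((8, 8), dtype=int)
gt[:, 4:] = 1
pred = np.zeros((8, 8), dtype=int)
pred[:, 5:] = 1
print(boundary_recall(pred, gt, tolerance=0))
print(boundary_recall(pred, gt, tolerance=1))
```

The tolerance matters: a prediction shifted by one pixel scores zero with no tolerance and full recall with a tolerance of one, so reported numbers are only comparable under the same setting.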

Semantic segmentation is the task of classifying each pixel in an image. Training a segmentation model achieves best results using annotated images, where each pixel is annotated with the corresponding class. When obtaining fine annotations is difficult or expensive, it may be possible to acquire coarse annotations, e.g. by roughly annotating pixels in an image, leaving some pixels around the boundaries between classes unlabeled. Segmentation with coarse annotations is difficult, in particular when the objective is to optimize the alignment of boundaries between classes. This paper proposes a regularization method for models with an encoder-decoder architecture with superpixel-based upsampling. It encourages the segmented pixels in the decoded image to be SLIC superpixels, which are based on pixel color and position, independent of the segmentation annotation. The method is applied to the FCN-16 fully convolutional network architecture and evaluated on the SUIM, Cityscapes, and PanNuke datasets. It is shown that the boundary recall improves significantly compared to state-of-the-art models when trained on coarse annotations.
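To make the notion of a coarse annotation concrete, the following is a minimal sketch that simulates one from a fine label map by marking every pixel near a class boundary as unlabeled, and then computes a cross-entropy over labeled pixels only. This is an illustration under stated assumptions, not the paper's procedure or its regularizer: the `IGNORE` value, the `radius` parameter, and both function names are hypothetical.

```python
import numpy as np

IGNORE = 255  # hypothetical marker for "unlabeled"; must fit the label dtype


def coarsen(labels, radius=2, ignore=IGNORE):
    """Return a copy of `labels` with pixels near class boundaries unlabeled.

    A pixel is "near a boundary" if any pixel inside the surrounding
    (2r+1) x (2r+1) window has a different label.
    """
    h, w = labels.shape
    padded = np.pad(labels, radius, mode="edge")
    near = np.zeros((h, w), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            near |= padded[dy:dy + h, dx:dx + w] != labels
    coarse = labels.copy()
    coarse[near] = ignore
    return coarse


def masked_cross_entropy(probs, coarse, ignore=IGNORE):
    """Mean negative log-likelihood over labeled pixels only.

    `probs` has shape (H, W, C) and holds per-pixel class probabilities.
    """
    labeled = coarse != ignore
    if not labeled.any():
        return 0.0
    rows, cols = np.nonzero(labeled)
    picked = probs[rows, cols, coarse[rows, cols]]
    return float(-np.log(np.clip(picked, 1e-12, 1.0)).mean())


# Toy example: two classes split by a vertical boundary.
fine = np.zeros((8, 8), dtype=np.int64)
fine[:, 4:] = 1
coarse = coarsen(fine, radius=1)
uniform = np.full((8, 8, 2), 0.5)
print((coarse == IGNORE).sum(), masked_cross_entropy(uniform, coarse))
```

Because the unlabeled band contributes nothing to such a loss, the training signal says nothing about where the boundary lies inside it, which is the gap a boundary-oriented regularizer is meant to fill.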

