BoxSnake: Polygonal Instance Segmentation with Box Supervision
BoxSnake addresses the need for cost-effective segmentation in computer vision by reducing annotation requirements, although the contribution is incremental because it builds on existing box-supervised methods.
The paper tackles polygonal instance segmentation using only box annotations. It proposes BoxSnake, which narrows the performance gap between the predicted segmentation and the bounding box relative to mask-based weakly supervised methods, and which reports a significant advantage on the Cityscapes dataset.
Box-supervised instance segmentation has gained much attention because it requires only simple box annotations instead of costly mask or polygon annotations. However, existing box-supervised instance segmentation models mainly focus on mask-based frameworks. We propose a new end-to-end training technique, termed BoxSnake, that achieves effective polygonal instance segmentation using only box annotations for the first time. Our method consists of two loss functions: (1) a point-based unary loss that constrains the bounding box of the predicted polygons to achieve coarse-grained segmentation; and (2) a distance-aware pairwise loss that encourages the predicted polygons to fit the object boundaries. Compared with mask-based weakly supervised methods, BoxSnake further reduces the performance gap between the predicted segmentation and the bounding box, and shows significant superiority on the Cityscapes dataset. The code is publicly available.
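To make the idea behind the point-based unary loss concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the tightest axis-aligned box around the predicted polygon vertices is compared with the annotated box, and it uses a plain 1 - IoU penalty as a stand-in, since the abstract does not specify the exact formulation. The names `polygon_bbox`, `box_iou`, and `unary_box_loss` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def polygon_bbox(vertices: List[Point]) -> Box:
    """Return the tightest axis-aligned box enclosing the polygon vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unary_box_loss(vertices: List[Point], gt_box: Box) -> float:
    """Assumed stand-in penalty: 1 - IoU between the polygon's box and the
    annotated box. The loss is zero when the two boxes coincide."""
    return 1.0 - box_iou(polygon_bbox(vertices), gt_box)


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    tight = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    narrow = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0)]
    print(unary_box_loss(tight, gt))
    print(unary_box_loss(narrow, gt))
```

A loss of this kind only constrains the polygon's extent, which is why the abstract describes the result as coarse-grained: many different polygons share the same bounding box, so the distance-aware pairwise loss is still needed to pull the vertices toward the actual object boundary. In a real training setup the min and max over vertices would be computed on tensors so that gradients reach the vertex coordinates.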