CV · Mar 26, 2018

Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++

arXiv:1803.09693v1 · 435 citations
Originality: Incremental advance
AI Analysis

This work addresses the bottleneck of dataset annotation for computer vision researchers and practitioners, offering incremental improvements to an existing interactive method.

The paper tackles the time-consuming problem of manually labeling object masks in datasets by introducing Polygon-RNN++, an interactive annotation tool that improves upon Polygon-RNN with a new CNN encoder, reinforcement learning training, and a Graph Neural Network that increases output resolution. On Cityscapes, the authors report a 10% absolute (16% relative) improvement in mean IoU over the original model in automatic mode, and 50% fewer annotator clicks in interactive mode.

Manually labeling datasets with object masks is extremely time consuming. In this work, we follow the idea of Polygon-RNN to produce polygonal annotations of objects interactively using humans-in-the-loop. We introduce several important improvements to the model: 1) we design a new CNN encoder architecture, 2) show how to effectively train the model with Reinforcement Learning, and 3) significantly increase the output resolution using a Graph Neural Network, allowing the model to accurately annotate high-resolution objects in images. Extensive evaluation on the Cityscapes dataset shows that our model, which we refer to as Polygon-RNN++, significantly outperforms the original model in both automatic (10% absolute and 16% relative improvement in mean IoU) and interactive modes (requiring 50% fewer clicks by annotators). We further analyze the cross-domain scenario in which our model is trained on one dataset, and used out of the box on datasets from varying domains. The results show that Polygon-RNN++ exhibits powerful generalization capabilities, achieving significant improvements over existing pixel-wise methods. Using simple online fine-tuning we further achieve a high reduction in annotation time for new datasets, moving a step closer towards an interactive annotation tool to be used in practice.
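The abstract reports its gains as mean IoU, in both absolute and relative terms. As a rough illustration of what those quantities mean, the sketch below computes intersection over union for two binary masks and the absolute and relative improvement between two scores. It is a minimal sketch under stated assumptions, not the paper's evaluation code: the function names `mask_iou` and `improvements`, the list-of-rows mask representation, and the toy masks are all hypothetical, and the paper's exact averaging protocol is not reproduced here.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixels (assumed representation)


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two same-sized binary masks."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1  # pixel is in both masks
            if p or g:
                union += 1  # pixel is in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def improvements(baseline: float, new: float) -> Tuple[float, float]:
    """Return (absolute, relative) improvement of `new` over `baseline`."""
    absolute = new - baseline
    return absolute, absolute / baseline


# Toy 3x3 masks: the prediction is shifted one column left of the ground truth.
pred = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
gt = [[0, 1, 1],
      [0, 1, 1],
      [0, 0, 0]]

iou = mask_iou(pred, gt)  # 2 shared pixels out of 6 covered pixels
abs_gain, rel_gain = improvements(0.5, 0.6)  # illustrative scores, not from the paper
```

An absolute improvement is a difference in IoU points, while a relative improvement divides that difference by the baseline score, which is why the same result can be quoted as both 10% and 16%.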

Code Implementations: 3 repos
