DISIR: Deep Image Segmentation with Interactive Refinement
This work addresses efficient and accurate segmentation for aerial image analysis and offers a fast refinement process, though its improvement over existing interactive methods is incremental.
The paper tackles multi-class segmentation of aerial images by introducing an interactive deep learning approach that refines initial segmentations using user annotations, with each click correcting roughly 5000 pixels.
This paper presents an interactive approach for multi-class segmentation of aerial images. Specifically, it is based on a deep neural network that exploits both RGB images and user annotations. Starting from an initial output based on the image only, our network interactively refines this segmentation map using a concatenation of the image and the user annotations. Importantly, user annotations modify the inputs of the network, not its weights, which enables a fast and smooth process. Through experiments on two public aerial datasets, we show that user annotations are extremely rewarding: each click corrects roughly 5000 pixels. We analyze the impact of different aspects of our framework, such as the representation of the annotations, the volume of training data, and the network architecture. Code is available at https://github.com/delair-ai/DISIR.
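The input-side refinement described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the one-channel-per-class click encoding, the helper names `encode_clicks`, `build_input`, and `toy_network`, and the 1x1-convolution stand-in for the segmentation network are all assumptions made for illustration. The sketch also assumes that the initial, image-only pass is obtained by feeding empty annotation channels. The point it shows is that a click changes only the input tensor, while the weights stay fixed.

```python
import numpy as np


def encode_clicks(clicks, height, width, num_classes):
    """Rasterize clicks into one binary channel per class (hypothetical encoding).

    Each click is a (row, col, class_index) tuple.
    """
    ann = np.zeros((num_classes, height, width), dtype=np.float32)
    for row, col, cls in clicks:
        ann[cls, row, col] = 1.0
    return ann


def build_input(image, annotations):
    """Concatenate an RGB image (3, H, W) and annotations (C, H, W) channel-wise."""
    return np.concatenate([image, annotations], axis=0)


def toy_network(x, weights):
    """Stand-in for the segmentation network: a 1x1 convolution followed by argmax.

    `weights` has shape (num_classes, 3 + num_classes) and is never updated here.
    """
    logits = np.einsum("kc,chw->khw", weights, x)
    return logits.argmax(axis=0)


rng = np.random.default_rng(0)
H, W, C = 8, 8, 4  # toy image size and number of classes
image = rng.random((3, H, W), dtype=np.float32)
weights = rng.normal(size=(C, 3 + C)).astype(np.float32)
weights_before = weights.copy()

# Initial pass: image only, annotation channels left empty (assumption).
initial = toy_network(build_input(image, encode_clicks([], H, W, C)), weights)

# Refinement pass: the user clicks pixel (2, 5) and labels it as class 1.
clicks = [(2, 5, 1)]
refined = toy_network(build_input(image, encode_clicks(clicks, H, W, C)), weights)
```

Because the stand-in network is a 1x1 convolution, a click can only affect the clicked pixel here. A real segmentation network has a larger receptive field, which is what would allow a single click to influence many surrounding pixels.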