DIAL: Deep Interactive and Active Learning for Semantic Segmentation in Remote Sensing
This work addresses the challenge of reducing annotation effort for semantic segmentation in remote sensing. The contribution is incremental in that it combines existing interactive and active learning techniques.
The authors propose a deep interactive and active learning framework for obtaining accurate semantic segmentation maps of remote sensing images. The framework integrates human corrections into the network and guides annotation towards the most relevant areas, which improves efficiency and effectiveness across three datasets.
In this article, we propose a collaboration between a deep neural network and a human in the loop to swiftly obtain accurate segmentation maps of remote sensing images. In a nutshell, the human agent iteratively interacts with the network to correct its initially flawed predictions. Concretely, these interactions are annotations representing the semantic labels. Our methodological contribution is twofold. First, we propose two interactive learning schemes to integrate user inputs into deep neural networks. The first one concatenates the annotations with the network's other inputs. The second one uses the annotations as a sparse ground truth to retrain the network. Second, we propose an active learning strategy to guide the user towards the most relevant areas to annotate. To this end, we compare different state-of-the-art acquisition functions for evaluating the neural network's uncertainty, such as ConfidNet, entropy, or ODIN. Through experiments on three remote sensing datasets, we show the effectiveness of the proposed methods. Notably, we show that active learning based on uncertainty estimation quickly leads the user towards mistakes, and is thus relevant for guiding user interventions.
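As a rough illustration of two ideas mentioned above, the following minimal sketch computes an entropy-based uncertainty map, picks an area to show the user, and stacks sparse annotations onto the network input. It is not the authors' implementation. Of the acquisition functions named, only entropy is shown (ConfidNet and ODIN are not). The function names (`entropy_map`, `most_uncertain_patch`, `concat_annotations`), the patch-based selection rule, and the one-channel-per-class encoding of annotations are assumptions made for this example.

```python
import numpy as np


def entropy_map(probs, eps=1e-12):
    """Per-pixel Shannon entropy of softmax outputs with shape (C, H, W)."""
    return -np.sum(probs * np.log(probs + eps), axis=0)


def most_uncertain_patch(uncertainty, patch=4):
    """Top-left (row, col) of the non-overlapping patch with the highest mean uncertainty.

    Assumption: candidate areas form a regular grid of square patches.
    """
    h, w = uncertainty.shape
    rows, cols = h // patch, w // patch
    # Crop to a multiple of the patch size, then average inside each patch.
    cropped = uncertainty[: rows * patch, : cols * patch]
    means = cropped.reshape(rows, patch, cols, patch).mean(axis=(1, 3))
    r, c = np.unravel_index(np.argmax(means), means.shape)
    return int(r) * patch, int(c) * patch


def concat_annotations(image, clicks, num_classes):
    """Stack sparse user annotations onto the image as extra input channels.

    image: array of shape (B, H, W); clicks: iterable of (row, col, class_id).
    Assumption: one binary channel per class, set to 1 at annotated pixels.
    """
    _, h, w = image.shape
    ann = np.zeros((num_classes, h, w), dtype=image.dtype)
    for r, c, k in clicks:
        ann[k, r, c] = 1
    return np.concatenate([image, ann], axis=0)


if __name__ == "__main__":
    # Toy two-class prediction: confident everywhere except one 4x4 region.
    probs = np.full((2, 8, 8), 0.99)
    probs[0, 4:8, 0:4] = 0.5
    probs[1] = 1.0 - probs[0]

    row, col = most_uncertain_patch(entropy_map(probs), patch=4)
    print("suggested patch starts at", (row, col))

    # The user annotates one pixel of class 1 inside the suggested patch.
    net_input = concat_annotations(np.zeros((3, 8, 8)), [(row + 1, col + 1, 1)], 2)
    print("network input shape:", net_input.shape)
```

In the retraining scheme, the same sparse annotations would instead serve as targets, with the loss computed only at annotated pixels; that variant is not shown here.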