The color out of space: learning self-supervised representations for Earth Observation imagery
This work addresses the domain gap problem in remote sensing for researchers and practitioners, offering an incremental improvement over fine-tuning from ImageNet.
The paper tackles the lack of large annotated datasets for Earth Observation imagery by proposing a self-supervised colorization pretext task to learn representations from satellite data, showing improved performance on land cover classification and West Nile Virus detection tasks, with an ensemble model outperforming existing methods.
The recent growth in the number of satellite images fosters the development of effective deep-learning techniques for Remote Sensing (RS). However, their full potential is untapped due to the lack of large annotated datasets. Such a problem is usually countered by fine-tuning a feature extractor that is previously trained on the ImageNet dataset. Unfortunately, the domain of natural images differs from the RS one, which hinders the final performance. In this work, we propose to learn meaningful representations from satellite imagery, leveraging its high-dimensionality spectral bands to reconstruct the visible colors. We conduct experiments on land cover classification (BigEarthNet) and West Nile Virus detection, showing that colorization is a solid pretext task for training a feature extractor. Furthermore, we qualitatively observe that guesses based on natural images and colorization rely on different parts of the input. This paves the way to an ensemble model that eventually outperforms both the above-mentioned techniques.