Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention
This work addresses the automated large-scale control of agricultural parcels, an issue of political and economic importance, and offers incremental improvements in method efficiency.
The paper tackles satellite image time series classification for Earth monitoring, particularly in agriculture. It replaces convolutional layers with pixel-set encoders and uses self-attention to extract temporal features, outperforming the previous state of the art in precision while reducing processing time and memory requirements.
Satellite image time series, bolstered by their growing availability, are at the forefront of an extensive effort towards automated Earth monitoring by international institutions. In particular, large-scale control of agricultural parcels is an issue of major political and economic importance. In this regard, hybrid convolutional-recurrent neural architectures have shown promising results for the automated classification of satellite image time series. We propose an alternative approach in which the convolutional layers are advantageously replaced with encoders operating on unordered sets of pixels to exploit the typically coarse resolution of publicly available satellite images. We also propose to extract temporal features using a bespoke neural architecture based on self-attention instead of recurrent networks. We demonstrate experimentally that our method not only outperforms previous state-of-the-art approaches in terms of precision, but also significantly decreases processing time and memory requirements. Lastly, we release a large open-access annotated dataset as a benchmark for future work on satellite image time series.
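As a rough illustration of the two ideas named in the abstract, the sketch below encodes each date's unordered set of pixels with a shared per-pixel layer followed by order-invariant pooling, then mixes the per-date embeddings with single-head scaled dot-product self-attention. It is a minimal toy under stated assumptions, not the paper's architecture: the function names, the mean-and-standard-deviation pooling, the identity attention projections, the final temporal averaging, and all sizes and random weights are hypothetical choices made for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Apply y = W x + b to one vector (weights is a list of rows)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(weights, bias)]


def encode_pixel_set(pixels, weights, bias):
    """Encode an unordered set of pixels (one parcel, one date).

    Each pixel goes through the same linear layer and ReLU; the results are
    pooled with mean and standard deviation, so pixel order does not matter.
    """
    hidden = [[max(0.0, v) for v in linear(p, weights, bias)] for p in pixels]
    n = len(hidden)
    dims = range(len(hidden[0]))
    mean = [sum(h[d] for h in hidden) / n for d in dims]
    std = [math.sqrt(sum((h[d] - mean[d]) ** 2 for h in hidden) / n)
           for d in dims]
    return mean + std  # concatenated pooled statistics


def self_attention(seq):
    """Single-head scaled dot-product self-attention, identity projections."""
    scale = math.sqrt(len(seq[0]))
    out = []
    for query in seq:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in seq]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        out.append([sum(e / total * value[d] for e, value in zip(exps, seq))
                    for d in range(len(query))])
    return out


# Hypothetical toy sizes: 4 dates, 6 pixels per parcel, 3 bands, 5 hidden units.
rng = random.Random(0)
n_dates, n_pixels, n_bands, n_hidden = 4, 6, 3, 5
w = [[rng.uniform(-1, 1) for _ in range(n_bands)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
series = [[[rng.random() for _ in range(n_bands)] for _ in range(n_pixels)]
          for _ in range(n_dates)]

embeddings = [encode_pixel_set(pixels, w, b) for pixels in series]
attended = self_attention(embeddings)
# Average over dates to get one feature vector per parcel (an assumption).
parcel_feature = [sum(step[d] for step in attended) / n_dates
                  for d in range(2 * n_hidden)]
```

Because the pooling statistics do not depend on pixel order, shuffling the pixels of a parcel leaves its embedding unchanged up to floating-point rounding. That invariance is what lets a set encoder stand in for a convolution when parcels cover only a few coarse pixels.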