GEO-Bench: Toward Foundation Models for Earth Monitoring
This work addresses the need for standardized evaluation in remote sensing, a prerequisite for measuring progress in Earth monitoring. Its contribution is incremental in that it provides a benchmark rather than new methods.
The paper addresses the lack of foundation models suited to Earth monitoring by proposing GEO-Bench, a benchmark of 12 tasks (six classification and six segmentation), and reports results for 20 baselines to assess how existing models perform.
Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.
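The abstract mentions reporting aggregated results across tasks but does not describe the procedure here. As a rough illustration only, the sketch below shows one way scores from heterogeneous tasks could be combined: rescale each task score to a common range, summarize with an interquartile mean, and attach a bootstrap interval. The choice of min-max normalization, the interquartile mean, the function names, and all numbers are assumptions made for this example, not details taken from the text above.

```python
import random
import statistics


def normalize(score, low, high):
    """Min-max rescale a raw task score into [0, 1] using reference bounds (assumed)."""
    return (score - low) / (high - low)


def interquartile_mean(values):
    """Mean of the middle half of the sorted values (drops the top and bottom quarter)."""
    ordered = sorted(values)
    k = len(ordered) // 4
    return statistics.fmean(ordered[k:len(ordered) - k])


def bootstrap_interval(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the interquartile mean."""
    rng = random.Random(seed)
    stats = sorted(
        interquartile_mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Placeholder scores for hypothetical tasks; these are NOT results from the paper.
raw_scores = {"task_a": 0.90, "task_b": 0.60, "task_c": 0.75, "task_d": 0.80}
# Hypothetical per-task reference bounds (e.g. a weak and a strong baseline).
bounds = {"task_a": (0.5, 1.0), "task_b": (0.4, 0.8),
          "task_c": (0.5, 1.0), "task_d": (0.6, 1.0)}

normalized = [normalize(raw_scores[t], *bounds[t]) for t in raw_scores]
aggregate = interquartile_mean(normalized)
interval = bootstrap_interval(normalized)
print(f"aggregate={aggregate:.3f}, interval={interval}")
```

Normalizing before aggregating keeps a task with a naturally high score range from dominating the summary, and a trimmed statistic such as the interquartile mean is less sensitive to a single outlying task than a plain mean.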