CV · LG · Aug 29, 2025

SatDINO: A Deep Dive into Self-Supervised Pretraining for Remote Sensing

arXiv:2508.21402v1 · h-index: 2 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses representation learning for remote sensing, where large unlabeled datasets are common, but it appears incremental, since it adapts an existing method (DINO) to a specific domain.

The authors address self-supervised pretraining for remote sensing imagery with SatDINO, a DINO-based model. According to the abstract, it outperforms state-of-the-art methods built on masked autoencoders (MAE) and achieves competitive results on multiple benchmarks.

Self-supervised learning has emerged as a powerful tool for remote sensing, where large amounts of unlabeled data are available. In this work, we investigate the use of DINO, a contrastive self-supervised method, for pretraining on remote sensing imagery. We introduce SatDINO, a model tailored for representation learning in satellite imagery. Through extensive experiments on multiple datasets in multiple testing setups, we demonstrate that SatDINO outperforms other state-of-the-art methods based on much more common masked autoencoders (MAE) and achieves competitive results in multiple benchmarks. We also provide a rigorous ablation study evaluating SatDINO's individual components. Finally, we propose a few novel enhancements, such as a new way to incorporate ground sample distance (GSD) encoding and adaptive view sampling. These enhancements can be used independently on our SatDINO model. Our code and trained models are available at: https://github.com/strakaj/SatDINO.
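
For readers unfamiliar with the DINO objective that SatDINO builds on, the following is a minimal sketch of the generic student-teacher loss for a single pair of views, as described in the original DINO work. It is not taken from the SatDINO repository: the function names are hypothetical, the temperature and momentum values are illustrative defaults, and the GSD encoding and adaptive view sampling mentioned in the abstract are not modeled here.

```python
import math


def log_softmax(logits, temperature):
    """Temperature-scaled log-softmax, shifted by the max for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - peak - log_total for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution for one pair of views.

    Temperatures are illustrative assumptions, not SatDINO's settings.
    """
    # Centering subtracts a running mean of teacher outputs (passed in here).
    centered = [t - c for t, c in zip(teacher_logits, center)]
    # A low teacher temperature sharpens the target distribution.
    teacher_probs = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def ema_update(teacher_params, student_params, momentum=0.996):
    """Teacher weights follow the student as an exponential moving average;
    no gradient flows through the teacher."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


if __name__ == "__main__":
    teacher = [2.0, 0.0, -1.0]
    center = [0.0, 0.0, 0.0]
    print(dino_loss([2.0, 0.0, -1.0], teacher, center))  # student agrees
    print(dino_loss([-1.0, 0.0, 2.0], teacher, center))  # student disagrees
```

In full training, this loss is summed over pairs of differently cropped views of the same image, and the center is itself updated as a moving average of teacher outputs.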
