LG · Nov 4, 2025

GeoCrossBench: Cross-Band Generalization for Remote Sensing

Hakob Tamazyan, Ani Vanyan, Alvard Barseghyan, Anna Khosrovyan, Evan Shelhamer, Hrant Khachatrian

arXiv:2511.02831v1 · h-index: 8

Originality: Incremental advance

AI Analysis

The paper addresses the rising cost of retraining Earth observation models as satellite diversity grows, though its contribution is incremental: an extension of an existing benchmark paired with a model adaptation.

The authors study how well remote sensing models generalize across satellites with different spectral bands. They show that even state-of-the-art foundation models suffer a 2-4x performance drop when generalizing to satellites with no band overlap, and that their proposed ChiViT model outperforms alternatives such as DINOv3 in this setting.

The number and diversity of remote sensing satellites grow over time, while the vast majority of labeled data comes from older satellites. As foundation models for Earth observation scale up, the cost of (re-)training them to support new satellites grows too, so their ability to generalize to new satellites becomes increasingly important. In this work we introduce GeoCrossBench, an extension of the popular GeoBench benchmark with a new evaluation protocol that tests in-distribution performance, generalization to satellites with no band overlap, and generalization to satellites with additional bands relative to the training set. We also develop ChiViT, a self-supervised extension of ChannelViT, to improve its cross-satellite performance. First, we show that even the best foundation models for remote sensing (DOFA, TerraFM) do not outperform general-purpose models like DINOv3 in the in-distribution setting. Second, when generalizing to new satellites with no band overlap, all models suffer a 2-4x drop in performance, and ChiViT significantly outperforms the runner-up, DINOv3. Third, the performance of all tested models drops on average by 5-25% when they are given additional bands at test time. Finally, we show that fine-tuning just the last linear layer of these models using oracle labels from all bands yields relatively consistent performance across all satellites, highlighting that the benchmark is far from saturated. We publicly release the code and the datasets to encourage the development of more future-proof remote sensing models with stronger cross-satellite generalization.
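
One way to picture the three evaluation settings in the protocol is as a comparison between the set of bands seen during training and the set available at test time. The following is a minimal sketch under that assumption, not the benchmark's released code: the function name `classify_setting`, the setting labels, and the example band names are hypothetical and chosen only for illustration.

```python
from typing import Iterable


def classify_setting(train_bands: Iterable[str], test_bands: Iterable[str]) -> str:
    """Label an evaluation setting by how the test bands relate to the training bands.

    Hypothetical helper: the labels mirror the three settings named in the
    abstract, plus a catch-all for partial overlaps the abstract does not cover.
    """
    train, test = set(train_bands), set(test_bands)
    if not train or not test:
        raise ValueError("both band sets must be non-empty")
    if test == train:
        # Same bands at train and test time.
        return "in_distribution"
    if train.isdisjoint(test):
        # The test satellite shares no band with the training data.
        return "no_band_overlap"
    if train < test:
        # The test satellite provides every training band plus extra ones.
        return "additional_bands"
    # Partial overlap or missing bands: outside the three settings above.
    return "other"


# Illustrative band names only (assumed, not taken from the benchmark).
rgb = ["B02", "B03", "B04"]
print(classify_setting(rgb, rgb))
print(classify_setting(rgb, ["VV", "VH"]))
print(classify_setting(rgb, rgb + ["B08"]))
```

Using plain set operations keeps the distinction between the settings explicit: equality, disjointness, and strict superset correspond to the three cases the abstract describes.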
