CV | Dec 29, 2025

Scaling Remote Sensing Foundation Models: Data Domain Tradeoffs at the Peta-Scale

arXiv:2512.23903v2 | h-index: 2
AI Analysis

This work addresses the problem of optimizing data and compute for remote sensing AI. The contribution is incremental, since it applies known scaling laws to a new domain.

The study tackled the challenge of scaling foundation models for remote sensing by training vision transformers on over a quadrillion pixels of satellite data, finding that performance remained data-limited rather than model-limited at peta-scale.

We explore the scaling behaviors of artificial intelligence to establish practical techniques for training foundation models on high-resolution electro-optical (EO) datasets that exceed the current state-of-the-art scale by orders of magnitude. Modern multimodal machine learning (ML) applications, such as generative artificial intelligence (GenAI) systems for image captioning, search, and reasoning, depend on robust, domain-specialized encoders for non-text modalities. In natural image domains where internet-scale data is plentiful, well-established scaling laws help optimize the joint scaling of model capacity, training compute, and dataset size. Unfortunately, these relationships are much less well understood in high-value domains like remote sensing (RS). Using over a quadrillion pixels of commercial satellite EO data and MITRE's Federal AI Sandbox, we train progressively larger vision transformer (ViT) backbones, report successes and failure modes observed at peta-scale, and analyze implications for bridging domain gaps across additional RS modalities. We observe that even at this scale, performance is consistent with a data-limited regime rather than a model parameter-limited one. These practical insights are intended to inform data collection strategies, compute budgets, and optimization schedules that advance the future development of frontier scale RS foundation models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
