CV · Apr 15, 2025

TerraMesh: A Planetary Mosaic of Multimodal Earth Observation Data

arXiv:2504.11172v2 · 12 citations · h-index: 19 · Has Code · 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Originality: Synthesis-oriented
AI Analysis

This dataset addresses the need of remote sensing and AI researchers and practitioners for large-scale, multimodal Earth Observation data. The contribution is incremental, since it builds on existing data collection efforts.

The authors address the limited scale, geographic coverage, and sensor variety of public Earth Observation datasets by introducing TerraMesh, a globally diverse, multimodal dataset with over 9 million samples and eight aligned modalities. Models pre-trained on TerraMesh showed improved performance.

Large-scale foundation models in Earth Observation can learn versatile, label-efficient representations by leveraging massive amounts of unlabeled data. However, existing public datasets are often limited in scale, geographic coverage, or sensor variety. We introduce TerraMesh, a new globally diverse, multimodal dataset combining optical, synthetic aperture radar, elevation, and land-cover modalities in an Analysis-Ready Data format. TerraMesh includes over 9 million samples with eight spatiotemporally aligned modalities, enabling large-scale pre-training. We provide detailed data processing steps, comprehensive statistics, and empirical evidence demonstrating improved model performance when pre-trained on TerraMesh. The dataset is hosted at https://huggingface.co/datasets/ibm-esa-geospatial/TerraMesh.

