SSL4EO-S12 v1.1: A Multimodal, Multiseasonal Dataset for Pretraining, Updated
This dataset facilitates open research in self-supervised learning and geospatial analysis for Earth Observation applications, though it is incremental as an update to an existing dataset.
The paper introduces SSL4EO-S12 v1.1, an updated multimodal and multiseasonal Earth Observation dataset designed for pretraining foundation models, addressing previous issues like data misalignment and limited structure to provide nearly one million patches covering 10,000 cities across four seasons.
This technical report presents SSL4EO-S12 v1.1, a multimodal, multitemporal Earth Observation dataset designed for pretraining large-scale foundation models. Building on the success of SSL4EO-S12 v1.0, the new version addresses the previous challenges of data misalignment and a limited data structure for low-barrier, analysis-ready EO processing. SSL4EO-S12 v1.1 covers the world's 10,000 largest cities and its surroundings within a 50 km radius across four seasons, resulting in a diverse collection of nearly one million patches. SSL4EO-S12 v1.1 packages the data in Zarr file format for cloud-efficient loading and representation of meta-information such as including cloud masks and geolocation. Released under the CC-BY-4.0 license, SSL4EO-S12 v1.1 facilitates open research and provides a robust foundation for future advancements in self-supervised learning and geospatial analysis. The dataset is available online through https://datapub.fz-juelich.de/ssl4eo-s12, and we provided additional resources at https://github.com/DLR-MF-DAS/SSL4EO-S12-v1.1.