On-Demand Earth System Data Cubes
This work addresses the need for streamlined data preparation in Earth system science, enabling faster creation of task-specific training data for AI-driven tasks. The contribution is incremental in that it builds on the existing ESDC and STAC frameworks.
The paper tackles the challenge of efficiently generating Earth System Data Cubes (ESDCs) for AI applications by introducing cubo, an open-source Python tool that automates this process with minimal user input: central coordinates, spatial resolution, edge size, and time range.
Advances in Earth system science have produced a surge of diverse datasets. Earth System Data Cubes (ESDCs) were introduced to handle this influx of high-dimensional data efficiently. ESDCs organise information within spatio-temporal grids, offering a structured, intuitive framework for data analysis. Because the data are well organised, ESDCs are well suited to a wide range of Artificial Intelligence (AI) tasks. An automated framework for creating AI-focused ESDCs with minimal user input could significantly accelerate the generation of task-specific training data. Here we introduce cubo, an open-source Python tool designed for easy generation of AI-focused ESDCs. cubo builds ESDCs from collections in SpatioTemporal Asset Catalogs (STAC) that are stored as Cloud Optimised GeoTIFFs (COGs), and requires only central coordinates, spatial resolution, edge size, and time range.
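To make the role of the four inputs concrete, the following minimal sketch shows how central coordinates, spatial resolution, and edge size can together fix the spatial footprint of a square cube, with the time range carried alongside. It is an illustration under stated assumptions, not cubo's implementation: the names `CubeRequest` and `bounding_box` are hypothetical, the centre is assumed to be already expressed in a projected CRS with metre units (for example UTM), and the snapping of the centre to the pixel grid is one plausible design choice rather than a documented behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeRequest:
    """Hypothetical container for the inputs named in the abstract."""
    center_x: float    # easting of the cube centre, metres (projected CRS assumed)
    center_y: float    # northing of the cube centre, metres
    resolution: float  # pixel size, metres
    edge_size: int     # number of pixels along each side of the cube
    start_date: str    # start of the time range, ISO 8601 date
    end_date: str      # end of the time range, ISO 8601 date


def bounding_box(req: CubeRequest) -> tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of a square cube around the centre.

    Assumption: the centre is first snapped to the nearest multiple of the
    resolution so that the cube edges fall on pixel boundaries.
    """
    snapped_x = round(req.center_x / req.resolution) * req.resolution
    snapped_y = round(req.center_y / req.resolution) * req.resolution
    half_side = req.edge_size * req.resolution / 2
    return (
        snapped_x - half_side,
        snapped_y - half_side,
        snapped_x + half_side,
        snapped_y + half_side,
    )


# Illustrative values only: a 64 x 64 pixel cube at 10 m resolution.
request = CubeRequest(500003.0, 4649998.0, 10.0, 64, "2021-06-01", "2021-06-10")
bbox = bounding_box(request)
side_length = bbox[2] - bbox[0]  # 64 pixels * 10 m = 640 m per side
```

The resulting bounding box and the time range are the kind of query that could then be passed to a STAC search to select the COG assets covering the cube. The choice to define the cube by a centre and an edge size, rather than by an arbitrary polygon, keeps every cube the same shape in pixels, which is convenient for batching training samples.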