Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access
The work addresses an engineering bottleneck for researchers and practitioners in Earth observation and enables more transparent and accessible workflows, although the contribution is incremental because it builds on existing tools.
The paper tackles the problem of fragmented, incompatible pre-computed geospatial embedding products, which hinder model comparison and reproducibility. It introduces a unified API in TorchGeo that standardizes access and loading, decoupling downstream analysis from model-specific engineering.
Geospatial Foundation Models (GFMs) provide powerful representations, but high compute costs hinder their widespread use. Pre-computed embedding data products offer a practical "frozen" alternative, yet they currently exist in a fragmented ecosystem of incompatible formats and resolutions. This lack of standardization creates an engineering bottleneck that prevents meaningful model comparison and reproducibility. We formalize this landscape through a three-layer taxonomy: Data, Tools, and Value. We survey existing products to identify interoperability barriers. To bridge this gap, we extend TorchGeo with a unified API that standardizes the loading and querying of diverse embedding products. By treating embeddings as first-class geospatial datasets, we decouple downstream analysis from model-specific engineering, providing a roadmap for more transparent and accessible Earth observation workflows.
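The idea of decoupling downstream analysis from product-specific storage can be illustrated with a minimal sketch. This is not the TorchGeo API described in the paper: the names `EmbeddingProduct`, `GriddedProduct`, `PointTableProduct`, `Sample`, and `mean_embedding` are hypothetical, and the two storage layouts are simplified assumptions chosen only to show that one bounding-box query interface can hide format differences.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# Bounding box as (min_lon, min_lat, max_lon, max_lat); an assumed convention.
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Sample:
    """One embedding vector with the location it describes (hypothetical type)."""
    lon: float
    lat: float
    embedding: Tuple[float, ...]


class EmbeddingProduct(ABC):
    """Hypothetical common interface: every product answers a bounding-box query."""

    @abstractmethod
    def query(self, bbox: BBox) -> List[Sample]:
        ...


class GriddedProduct(EmbeddingProduct):
    """Assumed layout: a regular north-up grid, with grid[row][col] an embedding."""

    def __init__(self, origin: Tuple[float, float], pixel_size: float,
                 grid: Sequence[Sequence[Sequence[float]]]) -> None:
        self.origin = origin          # (lon of left edge, lat of top edge)
        self.pixel_size = pixel_size  # cell size in degrees
        self.grid = grid

    def query(self, bbox: BBox) -> List[Sample]:
        min_lon, min_lat, max_lon, max_lat = bbox
        out = []
        for r, row in enumerate(self.grid):
            for c, emb in enumerate(row):
                # Convert the cell index to the coordinates of the cell centre.
                lon = self.origin[0] + (c + 0.5) * self.pixel_size
                lat = self.origin[1] - (r + 0.5) * self.pixel_size
                if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                    out.append(Sample(lon, lat, tuple(emb)))
        return out


class PointTableProduct(EmbeddingProduct):
    """Assumed layout: a table of records with explicit coordinates per row."""

    def __init__(self, rows: List[Dict]) -> None:
        self.rows = rows  # each row: {"lon": float, "lat": float, "vec": [...]}

    def query(self, bbox: BBox) -> List[Sample]:
        min_lon, min_lat, max_lon, max_lat = bbox
        return [
            Sample(row["lon"], row["lat"], tuple(row["vec"]))
            for row in self.rows
            if min_lon <= row["lon"] <= max_lon and min_lat <= row["lat"] <= max_lat
        ]


def mean_embedding(product: EmbeddingProduct, bbox: BBox) -> List[float]:
    """Downstream analysis written once, against the interface only."""
    samples = product.query(bbox)
    if not samples:
        raise ValueError("no embeddings inside the requested bounding box")
    dim = len(samples[0].embedding)
    return [sum(s.embedding[i] for s in samples) / len(samples) for i in range(dim)]


# Toy data: the same four embeddings stored in two different layouts.
grid_product = GriddedProduct(
    origin=(0.0, 2.0),
    pixel_size=1.0,
    grid=[[(1.0, 0.0), (3.0, 0.0)],
          [(5.0, 2.0), (7.0, 2.0)]],
)
table_product = PointTableProduct([
    {"lon": 0.5, "lat": 1.5, "vec": [1.0, 0.0]},
    {"lon": 1.5, "lat": 1.5, "vec": [3.0, 0.0]},
    {"lon": 0.5, "lat": 0.5, "vec": [5.0, 2.0]},
    {"lon": 1.5, "lat": 0.5, "vec": [7.0, 2.0]},
])

north_half = (0.0, 1.0, 2.0, 2.0)
print(mean_embedding(grid_product, north_half))
print(mean_embedding(table_product, north_half))
```

Both calls go through the same `query` method, so `mean_embedding` needs no knowledge of whether a product is gridded or tabular. A real implementation would also have to reconcile coordinate reference systems, resolutions, and time, which this sketch leaves out.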