LG · AI · Mar 8

What on Earth is AlphaEarth? Hierarchical structure and functional interpretability for global land cover

arXiv:2603.16911 · h-index: 13
Predicted impact: top 38% in LG · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the interpretability gap for geospatial AI models, offering practical guidance for dimension selection to reduce computational costs in operational land cover classification tasks, though it is incremental as it builds on existing interpretability studies.

The researchers set out to understand how the embeddings of geospatial foundation models, specifically AlphaEarth, are organized internally. They developed a functional interpretability framework to analyze the embeddings' hierarchical structure and found that land cover classification retaining 98% of baseline performance can be achieved using only 2 to 12 of the 64 dimensions, depending on the class, which reveals substantial redundancy.

Geospatial foundation models generate high-dimensional embeddings that achieve strong predictive performance, yet their internal organization remains obscure, limiting their scientific use. Recent interpretability studies relate Google AlphaEarth Foundations (GAEF) embeddings to continuous environmental variables, but it is still unclear whether the embedding space exhibits a functional or hierarchical organization, in which some dimensions act as specialized representations while others encode shared or broader geospatial structure. In this work, we propose a functional interpretability framework that reverse-engineers the role of embedding dimensions by characterizing their contribution to land cover structure from observed classification behavior. The approach combines large-scale experimentation with a structural analysis of embedding-class relationships based on feature importance patterns and progressive ablation. Our results show that embedding dimensions exhibit consistent and non-uniform functional behavior, allowing them to be categorized along a hierarchical functional spectrum: specialist dimensions associated with specific land cover classes, low- and mid-generalist dimensions capturing shared characteristics between classes, and high-generalist dimensions reflecting broader environmental gradients. Critically, we find that accurate land cover classification (98% of baseline performance) can be achieved using as few as 2 to 12 of the 64 available dimensions, depending on the class. This demonstrates substantial redundancy in the embedding space and offers a pathway toward significant reductions in computational cost. Together, these findings reveal that AlphaEarth embeddings are not only physically informative, but also functionally organized into a hierarchical structure, providing practical guidance for dimension selection in operational classification tasks.
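To make the dimension-selection idea concrete, the following is a minimal sketch, not the authors' pipeline. It assumes synthetic 64-dimensional data in place of real AlphaEarth embeddings, a simple mean-difference score as a stand-in for the paper's feature importance, and a nearest-centroid classifier as a stand-in for whatever model the authors used. All function names and the `INFORMATIVE` dimensions are hypothetical. It ranks dimensions by importance, then grows the subset in rank order until held-out accuracy reaches 98% of the all-dimension baseline.

```python
import random
import statistics

N_DIMS = 64               # embedding size stated in the abstract
INFORMATIVE = (0, 1, 2)   # hypothetical: dims carrying the class signal in this toy data


def make_data(n, rng):
    """Synthetic one-vs-rest data: label 1 shifts the informative dims by +2."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(N_DIMS)]
        for d in INFORMATIVE:
            row[d] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def mean(values):
    return sum(values) / len(values)


def importance(X, y):
    """Stand-in importance score: |class mean difference| / pooled std, per dim."""
    scores = []
    for d in range(N_DIMS):
        pos = [x[d] for x, lab in zip(X, y) if lab == 1]
        neg = [x[d] for x, lab in zip(X, y) if lab == 0]
        pooled = ((statistics.pvariance(pos) + statistics.pvariance(neg)) / 2) ** 0.5
        scores.append(abs(mean(pos) - mean(neg)) / pooled)
    return scores


def accuracy(train, test, dims):
    """Nearest-centroid accuracy on the test set using only the given dims."""
    (Xtr, ytr), (Xte, yte) = train, test
    cents = {
        c: [mean([x[d] for x, lab in zip(Xtr, ytr) if lab == c]) for d in dims]
        for c in (0, 1)
    }
    hits = 0
    for x, lab in zip(Xte, yte):
        dist = {c: sum((x[d] - m) ** 2 for d, m in zip(dims, cents[c])) for c in cents}
        hits += min(dist, key=dist.get) == lab
    return hits / len(yte)


def smallest_sufficient_subset(train, test, retain=0.98):
    """Add dims in importance order until accuracy >= retain * baseline."""
    scores = importance(*train)
    ranking = sorted(range(N_DIMS), key=lambda d: scores[d], reverse=True)
    baseline = accuracy(train, test, ranking)  # all 64 dims
    for k in range(1, N_DIMS + 1):
        acc = accuracy(train, test, ranking[:k])
        if acc >= retain * baseline:
            return ranking[:k], acc, baseline
    return ranking, baseline, baseline


rng = random.Random(0)
train, test = make_data(400, rng), make_data(200, rng)
dims, acc, baseline = smallest_sufficient_subset(train, test)
print(f"{len(dims)} of {N_DIMS} dims reach {acc:.3f} (baseline {baseline:.3f})")
```

Repeating this per land cover class (one-vs-rest) would give a per-class dimension count. Under the assumptions above, that count is the kind of figure the abstract reports as ranging from 2 to 12.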

