LG · AI · AP · ML · May 19

FLUXtrapolation: A benchmark on extrapolating ecosystem fluxes

arXiv:2605.19812 · 53.4
Predicted impact: top 47% in LG · last 90 days · Originality: Incremental advance
AI Analysis

This benchmark addresses the need for robust evaluation of machine learning methods for flux upscaling, a critical problem for climate science, by introducing realistic distribution shifts and multi-scale metrics.

FLUXtrapolation is a benchmark for extrapolating ecosystem fluxes under distribution shifts, designed to evaluate models on temporal, spatial, and temperature-based scenarios. A pilot study found that baselines perform similarly under median hourly RMSE but differ under tail-focused and multi-scale evaluation.

We introduce FLUXtrapolation, a benchmark for extrapolating ecosystem fluxes under progressively harder distribution shifts. Ecosystem fluxes are central to understanding the carbon, water, and energy cycles, yet they can only be measured directly at sparsely located measurement towers. Producing global flux estimates therefore requires training models on observed sites using globally available covariates and predicting in unobserved regions, that is, upscaling. Flux upscaling is a challenging domain generalization problem that is affected by a shift in covariate distribution across climates, ecosystem types, and environmental conditions, as well as by conditional shift: important drivers remain unobserved at global scale. We provide a quantitative analysis of both these shifts in $P_X$ and $P_{Y\mid X}$. FLUXtrapolation is designed based on domain expertise on flux upscaling: it defines temporal, spatial, and temperature-based extrapolation scenarios and evaluates performance across held-out domains, temporal aggregations, and tail errors. In a pilot study, we find that baselines perform similarly under median hourly RMSE, but separate under the proposed tail-focused and multi-scale evaluation. FLUXtrapolation therefore poses a realistic and thus relevant challenge for machine learning methods under distribution shift; at the same time, progress on this benchmark would directly support the scientific goal of improving flux upscaling.
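
The contrast between median hourly RMSE and tail-focused, multi-scale evaluation can be illustrated with a small sketch. The abstract does not give the benchmark's exact metric definitions, so everything below is an assumption made for illustration: the function names (`rmse`, `aggregate`, `evaluate`), the use of the 0.9 quantile of per-site RMSE as the "tail" metric, and the 24-hour mean as the coarser temporal scale are hypothetical and not taken from FLUXtrapolation itself.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def aggregate(values, window):
    """Mean over consecutive, non-overlapping windows.

    With hourly input and window=24 this yields daily means.
    An incomplete trailing window is dropped.
    """
    values = np.asarray(values, dtype=float)
    n = (len(values) // window) * window
    return values[:n].reshape(-1, window).mean(axis=1)


def evaluate(sites, window=24, tail_q=0.9):
    """Summarise per-site errors of one model on held-out sites.

    sites: dict mapping a site id to (y_true, y_pred) hourly arrays.
    Assumed metrics (illustrative, not the benchmark's definitions):
      - median over sites of the hourly RMSE
      - tail_q quantile over sites of the hourly RMSE (worst sites)
      - median over sites of the RMSE after temporal aggregation
    """
    hourly = np.array([rmse(t, p) for t, p in sites.values()])
    coarse = np.array(
        [rmse(aggregate(t, window), aggregate(p, window)) for t, p in sites.values()]
    )
    return {
        "median_hourly_rmse": float(np.median(hourly)),
        "tail_hourly_rmse": float(np.quantile(hourly, tail_q)),
        "median_aggregated_rmse": float(np.median(coarse)),
    }


if __name__ == "__main__":
    hours = 48
    truth = np.zeros(hours)
    sites = {
        "site_a": (truth, np.ones(hours)),                # constant bias
        "site_b": (truth, np.tile([1.0, -1.0], hours // 2)),  # noise that averages out
        "site_c": (truth, np.full(hours, 3.0)),           # one badly predicted site
    }
    print(evaluate(sites))
```

In this toy setup, a constant bias and zero-mean noise give the same hourly RMSE at a site, but only the bias survives daily aggregation, and a single badly predicted site moves the tail quantile while leaving the median unchanged. This is one way two models could look alike under a median hourly metric and still separate under tail and multi-scale metrics.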

