SpecTM: Spectral Targeted Masking for Trustworthy Foundation Models
SpecTM addresses a trustworthiness limitation of Earth observation foundation models, particularly those behind predictive models that guide public health decisions, though it appears incremental because it builds on existing self-supervised learning with targeted modifications.
The paper tackles the lack of physics constraints in foundation models for Earth observation by proposing SpecTM, a physics-informed masking design that improves microcystin concentration predictions, achieving R^2 = 0.695 for the current week and R^2 = 0.620 for 8-day-ahead prediction and surpassing baselines by up to +99%.
Foundation models are increasingly being developed for Earth observation (EO), yet they often rely on stochastic masking that does not explicitly enforce physics constraints, a critical trustworthiness limitation, particularly for predictive models that guide public health decisions. In this work, we propose SpecTM (Spectral Targeted Masking), a physics-informed masking design that encourages the reconstruction of targeted bands from cross-spectral context during pretraining. To achieve this, we developed an adaptable multi-task self-supervised learning (SSL) framework, covering band reconstruction, bio-optical index inference, and 8-day-ahead temporal prediction, that encodes spectrally intrinsic representations via joint optimization. We evaluated it on a downstream microcystin concentration regression task using NASA PACE hyperspectral imagery over Lake Erie. SpecTM achieves R^2 = 0.695 for current-week prediction and R^2 = 0.620 for 8-day-ahead prediction, surpassing all baseline models, by +34% over Ridge (0.51) and +99% over SVR (0.31), respectively. Our ablation experiments show that targeted masking improves predictions by +0.037 R^2 over random masking. Furthermore, SpecTM outperforms strong baselines with 2.2x superior label efficiency under extreme label scarcity. SpecTM enables physics-informed representation learning across EO domains and improves the interpretability of foundation models.
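To make the contrast between targeted and stochastic masking concrete, here is a minimal sketch of how the two masks might be built over a single spectrum and how a reconstruction loss could be restricted to the hidden bands. It is an illustration under stated assumptions, not the authors' implementation: the function names, the band indices, the zero fill value, and the equal task weights in `joint_loss` are all hypothetical, and plain Python lists stand in for hyperspectral tensors.

```python
import random

def targeted_band_mask(n_bands, target_bands):
    """True marks a band hidden from the encoder; the choice is fixed in advance."""
    targets = set(target_bands)
    return [b in targets for b in range(n_bands)]

def random_band_mask(n_bands, n_masked, rng):
    """Stochastic counterpart: hide n_masked bands chosen uniformly at random."""
    chosen = set(rng.sample(range(n_bands), n_masked))
    return [b in chosen for b in range(n_bands)]

def apply_mask(spectrum, mask, fill_value=0.0):
    """Return the masked input and the true values of the hidden bands."""
    masked = [fill_value if m else v for v, m in zip(spectrum, mask)]
    targets = [v for v, m in zip(spectrum, mask) if m]
    return masked, targets

def masked_mse(prediction, spectrum, mask):
    """Mean squared error computed only over the hidden bands."""
    errors = [(p - v) ** 2 for p, v, m in zip(prediction, spectrum, mask) if m]
    return sum(errors) / len(errors)

def joint_loss(band_loss, index_loss, temporal_loss, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three pretraining task losses (weights are assumed)."""
    w_band, w_index, w_temporal = weights
    return w_band * band_loss + w_index * index_loss + w_temporal * temporal_loss

# Toy usage with a 10-band spectrum and hypothetical target band indices.
rng = random.Random(0)
spectrum = [rng.random() for _ in range(10)]
mask = targeted_band_mask(10, [2, 5, 7])
masked_input, hidden_values = apply_mask(spectrum, mask)
```

In this sketch the only difference between the two strategies is how the mask is chosen: the targeted mask always hides the same preselected bands, so the model must repeatedly infer them from the remaining spectral context, while the random mask changes from draw to draw.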