Knowledge-Guided Masked Autoencoder with Linear Spectral Mixing and Spectral-Angle-Aware Reconstruction
This work addresses the challenge of enhancing deep learning models with domain knowledge for better interpretability and data efficiency, particularly in scientific domains, although the contribution appears incremental because it builds on existing transformer-based self-supervised methods.
The paper tackles the problem of integrating domain knowledge into deep learning to improve model interpretability and generalization. It proposes a knowledge-guided ViT-based Masked Autoencoder that incorporates the Linear Spectral Mixing Model and the Spectral Angle Mapper as physical constraints. The reported result is enhanced reconstruction quality and improved downstream task performance, which the authors take as evidence that embedding physics-informed inductive biases in self-supervised learning is effective.
Integrating domain knowledge into deep learning has emerged as a promising direction for improving model interpretability, generalization, and data efficiency. In this work, we present a novel knowledge-guided ViT-based Masked Autoencoder that embeds scientific domain knowledge within the self-supervised reconstruction process. Instead of relying solely on data-driven optimization, our approach incorporates the Linear Spectral Mixing Model (LSMM) as a physical constraint, together with the physically based Spectral Angle Mapper (SAM), so that learned representations adhere to known structural relationships between observed signals and their latent components. The framework jointly optimizes the LSMM and SAM losses alongside a conventional Huber loss, promoting both numerical accuracy and geometric consistency in the feature space. This knowledge-guided design enhances reconstruction fidelity, stabilizes training under limited supervision, and yields interpretable latent representations grounded in physical principles. The experimental findings indicate that the proposed model substantially enhances reconstruction quality and improves downstream task performance, highlighting the promise of embedding physics-informed inductive biases within transformer-based self-supervised learning.
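To make the joint objective concrete, the following is a minimal sketch of how a Huber term, a spectral-angle term, and a linear-mixing term could be combined for a single spectrum. It is an illustration under stated assumptions, not the formulation used in this work: the abstract does not specify how the LSMM loss is defined, how the terms are weighted, or how endmembers and abundances are obtained. The function names, the weights `w_lsmm` and `w_sam`, and the choice of a mean squared residual for the LSMM term are all hypothetical. Only the standard definitions are assumed: the linear mixing model expresses a spectrum as a weighted sum of endmember spectra, and the spectral angle is the arccosine of the normalized dot product.

```python
import math

def huber(pred, target, delta=1.0):
    """Mean Huber loss between two equal-length spectra."""
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        # Quadratic near zero, linear for large residuals.
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(pred)

def spectral_angle(pred, target, eps=1e-12):
    """Angle in radians between two spectra (Spectral Angle Mapper)."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    # Clamp to the valid acos domain to guard against rounding error.
    cos = max(-1.0, min(1.0, dot / (norm + eps)))
    return math.acos(cos)

def lsmm_mix(endmembers, abundances):
    """Linear mixing: band b of the mixed spectrum is sum_k a_k * e_k[b]."""
    bands = len(endmembers[0])
    return [sum(a * e[b] for a, e in zip(abundances, endmembers)) for b in range(bands)]

def combined_loss(pred, target, endmembers, abundances, w_lsmm=1.0, w_sam=1.0):
    """Hypothetical joint objective: Huber + weighted LSMM residual + weighted SAM.

    Assumption: the LSMM term is the mean squared error between the linearly
    mixed spectrum and the target; the source does not define this term.
    """
    mixed = lsmm_mix(endmembers, abundances)
    lsmm_term = sum((m - t) ** 2 for m, t in zip(mixed, target)) / len(target)
    return huber(pred, target) + w_lsmm * lsmm_term + w_sam * spectral_angle(pred, target)
```

In this sketch the Huber term penalizes per-band magnitude errors, while the spectral-angle term is insensitive to overall scaling and penalizes only differences in spectral shape, which is one way to read the "numerical accuracy and geometric consistency" pairing described above. In a real training loop these terms would be computed on batched tensors over masked patches with an automatic-differentiation framework.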