Cross-Domain Transfer of Hyperspectral Foundation Models
For practitioners of hyperspectral semantic segmentation in proximal sensing, this work offers a simpler yet effective alternative to cross-modality transfer that makes better use of limited data.
The paper proposes cross-domain transfer of hyperspectral foundation models from remote sensing to proximal sensing, achieving large performance improvements over in-domain training and maintaining strong performance with limited data, while reducing the gap to cross-modality approaches.
Hyperspectral imaging (HSI) semantic segmentation typically relies on in-domain training, but limited data availability often restricts model performance in real-world applications. Current approaches that leverage foundation models in proximal sensing use cross-modality techniques, bridging RGB and HSI to exploit vision foundation models. However, these methods either discard spectral information or introduce architectural complexity. We propose cross-domain transfer as an alternative: reusing HSI foundation models, originally trained in remote sensing, for proximal sensing applications. Because no modality gap has to be bridged, our approach preserves spectral information while maintaining a simple architecture. Using the HS3-Bench benchmark, we systematically evaluate and compare conventional in-domain, in-modality training, cross-modality transfer, and cross-domain transfer strategies. Our results demonstrate that cross-domain transfer achieves large performance improvements over in-domain, in-modality training, reduces the performance gap to cross-modality approaches, and maintains strong performance in limited-data settings. Thus, this work advances more effective HSI semantic segmentation in diverse applications.
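The claim that cross-modality methods can discard spectral information can be made concrete with a toy example. The sketch below is illustrative only: the 6-band spectra, the helper `to_pseudo_rgb`, and the band-averaging scheme are assumptions made for this example and are not the preprocessing used in the paper or in HS3-Bench. It shows two pixels with different spectra that become indistinguishable once the bands are collapsed to three channels for an RGB vision model, whereas a model that consumes all bands still sees them as different.

```python
# Toy illustration (hypothetical): collapsing HSI bands to 3 channels can
# erase differences between spectra. The averaging scheme is an assumption.

def to_pseudo_rgb(spectrum):
    """Collapse a spectrum to 3 values by averaging three equal-width band groups."""
    n = len(spectrum)
    if n % 3 != 0:
        raise ValueError("band count must be divisible by 3 in this sketch")
    group = n // 3
    return tuple(
        sum(spectrum[i * group:(i + 1) * group]) / group for i in range(3)
    )

# Two made-up 6-band pixels that differ inside the first and last band groups.
pixel_a = [0.2, 0.4, 0.5, 0.5, 0.9, 0.1]
pixel_b = [0.4, 0.2, 0.5, 0.5, 0.1, 0.9]

rgb_a = to_pseudo_rgb(pixel_a)
rgb_b = to_pseudo_rgb(pixel_b)

# After reduction the pixels are identical; in full HSI they are not.
same_after_reduction = rgb_a == rgb_b
distinct_in_hsi = pixel_a != pixel_b
print(same_after_reduction, distinct_in_hsi)
```

Real cross-modality pipelines typically use more elaborate reductions or adapters than plain averaging, but any mapping from many bands to three channels is many-to-one, so some spectra must collide in this way.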