CVIVAug 1, 2024

Cross-Scan Mamba with Masked Training for Robust Spectral Imaging

arXiv:2408.00629v2h-index: 24
Originality Incremental advance
AI Analysis

This addresses the problem of robust spectral imaging for applications like remote sensing or medical imaging, but it is incremental as it builds on existing Mamba models and training techniques.

The paper tackles hyperspectral image reconstruction from compressed measurements by proposing CS-Mamba for better modeling of spatial-spectral correlations and a masked training method to improve generalization to real data, achieving state-of-the-art performance with enhanced visual quality.

Snapshot Compressive Imaging (SCI) enables fast spectral imaging but requires effective decoding algorithms for hyperspectral image (HSI) reconstruction from compressed measurements. Current CNN-based methods are limited in modeling long-range dependencies, while Transformer-based models face high computational complexity. Although recent Mamba models outperform CNNs and Transformers in RGB tasks concerning computational efficiency or accuracy, they are not specifically optimized to fully leverage the local spatial and spectral correlations inherent in HSIs. To address this, we propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding and cross-channel interaction promotion. Besides, while current reconstruction algorithms perform increasingly well in simulation scenarios, they exhibit suboptimal performance on real data due to limited generalization capability. During the training process, the model may not capture the inherent features of the images but rather learn the parameters to mitigate specific noise and loss, which may lead to a decline in reconstruction quality when faced with real scenes. To overcome this challenge, we propose a masked training method to enhance the generalization ability of models. Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes