Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging
This work addresses the nonlinear and ill-posed reconstruction problem in hyperspectral imaging, which is important for applications like remote sensing and medical imaging, but appears incremental as it builds on existing deep unfolding networks with Mamba-inspired modifications.
The paper tackles the challenge of accurately and stably reconstructing 3D hyperspectral images from single 2D measurements in snapshot spectral compressive imaging by proposing MiJUN, a Mamba-inspired joint unfolding network that integrates physics-based and learning-based approaches, achieving superior detail representation in numerical and visual comparisons on simulation and real datasets.
In the coded aperture snapshot spectral imaging system, Deep Unfolding Networks (DUNs) have made impressive progress in recovering 3D hyperspectral images (HSIs) from a single 2D measurement. However, the inherent nonlinear and ill-posed characteristics of HSI reconstruction still pose challenges to existing methods in terms of accuracy and stability. To address this issue, we propose a Mamba-inspired Joint Unfolding Network (MiJUN), which integrates physics-embedded DUNs with learning-based HSI imaging. Firstly, leveraging the concept of trapezoid discretization to expand the representation space of unfolding networks, we introduce an accelerated unfolding network scheme. This approach can be interpreted as a generalized accelerated half-quadratic splitting with a second-order differential equation, which reduces the reliance on initial optimization stages and addresses challenges related to long-range interactions. Crucially, within the Mamba framework, we restructure the Mamba-inspired global-to-local attention mechanism by incorporating a selective state space model and an attention mechanism. This effectively reinterprets Mamba as a variant of the Transformer} architecture, improving its adaptability and efficiency. Furthermore, we refine the scanning strategy with Mamba by integrating the tensor mode-$k$ unfolding into the Mamba network. This approach emphasizes the low-rank properties of tensors along various modes, while conveniently facilitating 12 scanning directions. Numerical and visual comparisons on both simulation and real datasets demonstrate the superiority of our proposed MiJUN, and achieving overwhelming detail representation.