CVIVJun 17, 2024

HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

arXiv:2406.11519v2162 citationsHas Code
AI Analysis

This addresses the need for more practical and transferable hyperspectral intelligence in applications like urban planning and precision agriculture, representing a novel method rather than an incremental improvement.

The paper tackles the problem of hyperspectral image interpretation being task-specific and scene-dependent by introducing HyperSIGMA, a vision transformer-based foundation model that unifies tasks and scenes, achieving superior performance across various high-level and low-level tasks compared to state-of-the-art methods.

Accurate hyperspectral image (HSI) interpretation is critical for providing valuable insights into various earth observation-related applications such as urban planning, precision agriculture, and environmental monitoring. However, existing HSI processing methods are predominantly task-specific and scene-dependent, which severely limits their ability to transfer knowledge across tasks and scenes, thereby reducing the practicality in real-world applications. To address these challenges, we present HyperSIGMA, a vision transformer-based foundation model that unifies HSI interpretation across tasks and scenes, scalable to over one billion parameters. To overcome the spectral and spatial redundancy inherent in HSIs, we introduce a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic block of HyperSIGMA. HyperSIGMA integrates spatial and spectral features using a specially designed spectral enhancement module. In addition, we construct a large-scale hyperspectral dataset, HyperGlobal-450K, for pre-training, which contains about 450K hyperspectral images, significantly surpassing existing datasets in scale. Extensive experiments on various high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared to current state-of-the-art methods. Moreover, HyperSIGMA shows significant advantages in scalability, robustness, cross-modal transferring capability, real-world applicability, and computational efficiency. The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes