CVNov 21, 2025

Latent Dirichlet Transformer VAE for Hyperspectral Unmixing with Bundled Endmembers

arXiv:2511.17757v1
Originality Incremental advance
AI Analysis

This work addresses spectral mixing in hyperspectral imaging for material identification, offering a novel method that is incremental in combining transformers with physical constraints.

The paper tackles hyperspectral unmixing by proposing a Latent Dirichlet Transformer VAE that models materials as bundled endmembers, resulting in improved abundance estimation and endmember extraction with better performance on benchmark datasets.

Hyperspectral images capture rich spectral information that enables per-pixel material identification; however, spectral mixing often obscures pure material signatures. To address this challenge, we propose the Latent Dirichlet Transformer Variational Autoencoder (LDVAE-T) for hyperspectral unmixing. Our model combines the global context modeling capabilities of transformer architectures with physically meaningful constraints imposed by a Dirichlet prior in the latent space. This prior naturally enforces the sum-to-one and non-negativity conditions essential for abundance estimation, thereby improving the quality of predicted mixing ratios. A key contribution of LDVAE-T is its treatment of materials as bundled endmembers, rather than relying on fixed ground truth spectra. In the proposed method our decoder predicts, for each endmember and each patch, a mean spectrum together with a structured (segmentwise) covariance that captures correlated spectral variability. Reconstructions are formed by mixing these learned bundles with Dirichlet-distributed abundances garnered from a transformer encoder, allowing the model to represent intrinsic material variability while preserving physical interpretability. We evaluate our approach on three benchmark datasets, Samson, Jasper Ridge, and HYDICE Urban and show that LDVAE-T consistently outperforms state-of-the-art models in abundance estimation and endmember extraction, as measured by root mean squared error and spectral angle distance, respectively.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes