CV | Nov 16, 2024

SMLNet: A SPD Manifold Learning Network for Infrared and Visible Image Fusion

Huan Kang, Hui Li, Tianyang Xu, Xiao-Jun Wu, Rui Wang, Chunyang Cheng, Josef Kittler

arXiv:2411.10679v38.76 citations | h-index: 17 | Has Code | Int J Comput Vis

Originality: Incremental advance

AI Analysis

This work addresses image fusion for applications such as surveillance or medical imaging, but it appears incremental because it adapts existing manifold learning concepts to a specific task.

The paper tackles the problem of infrared and visible image fusion by proposing SMLNet, a method that extends image fusion from Euclidean space to SPD manifolds to better handle non-Euclidean data structures, achieving superior performance compared to state-of-the-art methods on public datasets.

Euclidean representation learning methods have achieved promising results in image fusion tasks, which can be attributed to their clear advantages in handling linear spaces. However, data collected from realistic scenes usually have a non-Euclidean structure, so evaluating the consistency of latent representations from paired views using Euclidean distance raises challenges. To address this issue, a novel SPD (symmetric positive definite) manifold learning network, named SMLNet, is proposed for multi-modal image fusion; it extends the image fusion approach from Euclidean space to SPD manifolds. Specifically, we encode images according to Riemannian geometry to exploit their intrinsic statistical correlations, thereby aligning with human visual perception. The SPD matrix underpins our network's learning process. Building upon this mathematical foundation, we employ a cross-modal fusion strategy to exploit modality-specific dependencies and augment complementary information. To capture semantic similarity in the images' intrinsic space, we further develop an attention module that processes the cross-modal semantic affinity matrix. On this basis, we design an end-to-end fusion network built on cross-modal manifold learning. Extensive experiments on public datasets demonstrate that our framework exhibits superior performance compared to current state-of-the-art methods. Our code will be publicly available at https://github.com/Shaoyun2023.
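To make the idea of comparing features on an SPD manifold rather than in Euclidean space more concrete, here is a minimal sketch, not taken from the paper. It assumes each modality yields a feature matrix of shape (channels, pixels), builds a regularized covariance descriptor as the SPD matrix, and compares two descriptors with the log-Euclidean distance, one common Riemannian-style metric on SPD matrices. The function names, the `eps` regularizer, and the choice of metric are all illustrative assumptions; the abstract does not say which construction or metric SMLNet actually uses.

```python
import numpy as np

def spd_from_features(features, eps=1e-5):
    """Covariance descriptor of a (channels, pixels) feature matrix.

    Adding eps * I keeps the result strictly positive definite
    (eps is an illustrative choice, not a value from the paper).
    """
    features = np.asarray(features, dtype=np.float64)
    centered = features - features.mean(axis=1, keepdims=True)
    n = features.shape[1]
    cov = centered @ centered.T / max(n - 1, 1)
    return cov + eps * np.eye(cov.shape[0])

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    # Scale each eigenvector column by log(eigenvalue), then recompose.
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def log_euclidean_distance(a, b):
    """Frobenius norm between the matrix logarithms of two SPD matrices."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")

# Random stand-ins for infrared and visible feature maps (hypothetical data).
rng = np.random.default_rng(0)
ir_feat = rng.standard_normal((4, 256))
vis_feat = rng.standard_normal((4, 256))

spd_ir = spd_from_features(ir_feat)
spd_vis = spd_from_features(vis_feat)
dist = log_euclidean_distance(spd_ir, spd_vis)
```

The log-Euclidean metric is used here only because it is simple to compute; other metrics on SPD manifolds (for example, the affine-invariant metric) would fit the same sketch by swapping out `log_euclidean_distance`.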
