CV · Nov 16, 2024

SMLNet: A SPD Manifold Learning Network for Infrared and Visible Image Fusion

arXiv:2411.10679v3 · 6 citations · h-index: 17 · Has Code · Int J Comput Vis
Originality: Incremental advance
AI Analysis

This paper addresses image fusion for applications such as surveillance and medical imaging, but it appears incremental because it adapts existing manifold learning concepts to a specific task.

The paper tackles the problem of infrared and visible image fusion by proposing SMLNet, a method that extends image fusion from Euclidean space to SPD manifolds to better handle non-Euclidean data structures, achieving superior performance compared to state-of-the-art methods on public datasets.

Euclidean representation learning methods have achieved promising results in image fusion tasks, which can be attributed to their clear advantages in handling linear spaces. However, data collected from realistic scenes usually has a non-Euclidean structure, so evaluating the consistency of latent representations from paired views using Euclidean distance raises challenges. To address this issue, a novel SPD (symmetric positive definite) manifold learning network, named SMLNet, is proposed for multi-modal image fusion; it extends the image fusion approach from Euclidean space to SPD manifolds. Specifically, we encode images according to Riemannian geometry to exploit their intrinsic statistical correlations, thereby aligning with human visual perception. The SPD matrix fundamentally underpins our network's learning process. Building upon this mathematical foundation, we employ a cross-modal fusion strategy to exploit modality-specific dependencies and augment complementary information. To capture semantic similarity in the images' intrinsic space, we further develop an attention module that meticulously processes the cross-modal semantic affinity matrix. On this basis, we design an end-to-end fusion network built on cross-modal manifold learning. Extensive experiments on public datasets demonstrate that our framework exhibits superior performance compared to the current state-of-the-art methods. Our code will be publicly available at https://github.com/Shaoyun2023.
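
To make the idea of encoding image features as SPD matrices concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a channel covariance descriptor as the SPD encoding and the log-Euclidean metric as the manifold distance, neither of which the abstract specifies. The function names (`covariance_descriptor`, `spd_log`, `log_euclidean_distance`) and the random stand-in feature maps are hypothetical.

```python
import numpy as np

def covariance_descriptor(features, eps=1e-5):
    """Map a (C, H, W) feature map to a C x C SPD matrix (assumed encoding)."""
    c = features.shape[0]
    x = features.reshape(c, -1)                # C x N, one column per pixel
    x = x - x.mean(axis=1, keepdims=True)      # center each channel
    cov = x @ x.T / (x.shape[1] - 1)           # channel covariance (PSD)
    return cov + eps * np.eye(c)               # small ridge makes it strictly PD

def spd_log(m):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(m)                   # m = V diag(w) V^T with w > 0
    return (v * np.log(w)) @ v.T               # V diag(log w) V^T

def log_euclidean_distance(a, b):
    """Log-Euclidean distance between two SPD matrices (assumed metric)."""
    return np.linalg.norm(spd_log(a) - spd_log(b), ord="fro")

# Random stand-ins for infrared and visible feature maps (hypothetical data).
rng = np.random.default_rng(0)
infrared_feat = rng.standard_normal((8, 16, 16))
visible_feat = rng.standard_normal((8, 16, 16))

spd_ir = covariance_descriptor(infrared_feat)
spd_vis = covariance_descriptor(visible_feat)
dist = log_euclidean_distance(spd_ir, spd_vis)
```

The distance is computed between matrix logarithms rather than between raw matrix entries, which is one common way to respect the curved geometry of the SPD manifold instead of treating the matrices as points in flat Euclidean space.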

