CV · Feb 27, 2025

One Model for ALL: Low-Level Task Interaction Is a Key to Task-Agnostic Image Fusion

arXiv:2502.19854v2 · 52 citations · h-index: 17 · Has Code · CVPR
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of task-agnostic image fusion for broader applicability in computer vision, though its hybrid approach appears incremental.

The paper tackles the problem of multimodal image fusion by leveraging low-level vision tasks from digital photography to enable effective feature interaction through pixel-level supervision, achieving high performance across both seen and unseen scenarios with a single model.

Advanced image fusion methods mostly prioritise high-level missions, where task interaction struggles with semantic gaps, requiring complex bridging mechanisms. In contrast, we propose to leverage low-level vision tasks from digital photography fusion, allowing for effective feature interaction through pixel-level supervision. This new paradigm provides strong guidance for unsupervised multimodal fusion without relying on abstract semantics, enhancing task-shared feature learning for broader applicability. Owing to the hybrid image features and enhanced universal representations, the proposed GIFNet supports diverse fusion tasks, achieving high performance across both seen and unseen scenarios with a single model. Uniquely, experimental results reveal that our framework also supports single-modality enhancement, offering superior flexibility for practical applications. Our code will be available at https://github.com/AWCXV/GIFNet.

Code Implementations: 1 repo
