CV · Dec 21, 2024

IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks

Yaming Zhang, Chenqiang Gao, Fangcen Liu, Junjie Guo, Lan Wang, Xinggan Peng, Deyu Meng

arXiv:2412.16654v43.71 citations · h-index: 10 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses parameter efficiency and overfitting in multimodal IR-VIS tasks such as object detection and segmentation, but the advance is incremental because it builds on existing parameter-efficient tuning methods.

The paper tackles the problem of constrained feature spaces in infrared-visible (IR-VIS) tasks under full fine-tuning, proposing IV-tuning to parameter-efficiently harness pre-trained visual models with less than 3% of backbone parameters, effectively alleviating overfitting and improving generalization.
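To give a sense of scale for the "less than 3% of backbone parameters" figure, the following is a minimal back-of-the-envelope sketch. It assumes a ViT-Base-like backbone (12 blocks, width 768) and a generic bottleneck adapter per block; these dimensions, the adapter design, and all function names are illustrative assumptions, not the modules or backbones used by IV-tuning. The count also ignores patch and position embeddings.

```python
def vit_block_params(d: int, mlp_ratio: int = 4) -> int:
    """Parameter count of one standard Transformer encoder block of width d."""
    # Attention: fused QKV projection (3*d*d + 3*d) plus output projection (d*d + d).
    attn = 4 * d * d + 4 * d
    # MLP: two linear layers, d -> mlp_ratio*d -> d, each with a bias.
    hidden = mlp_ratio * d
    mlp = d * hidden + hidden + hidden * d + d
    # Two LayerNorms, each with a weight and a bias vector.
    norms = 2 * 2 * d
    return attn + mlp + norms


def adapter_params(d: int, r: int) -> int:
    """Hypothetical bottleneck adapter: down-projection d -> r, up-projection r -> d."""
    return d * r + r + r * d + d


def trainable_fraction(depth: int, d: int, r: int) -> float:
    """Ratio of adapter (trainable) parameters to frozen backbone parameters."""
    backbone = depth * vit_block_params(d)
    adapters = depth * adapter_params(d, r)
    return adapters / backbone


if __name__ == "__main__":
    # Assumed ViT-Base-like sizes with a bottleneck width of 64.
    frac = trainable_fraction(depth=12, d=768, r=64)
    print(f"trainable / backbone = {frac:.2%}")
```

Under these assumed sizes the trainable share stays well below 3%, which illustrates how a frozen backbone with small added modules can fit such a budget.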

Existing infrared and visible (IR-VIS) methods inherit the general representations of Pre-trained Visual Models (PVMs) to facilitate complementary learning. However, our analysis indicates that under the full fine-tuning paradigm, the feature space becomes highly constrained and low-rank, which has been proven to seriously impair generalization. One solution is freezing parameters to preserve pre-trained knowledge and thus maintain diversity of the feature space. To this end, we propose IV-tuning, to parameter-efficiently harness PVMs for various IR-VIS downstream tasks, including salient object detection, semantic segmentation, and object detection. Compared with the full fine-tuning baselines and existing IR-VIS methods, IV-tuning facilitates the learning of complementary information between infrared and visible modalities with less than 3% of the backbone parameters, and effectively alleviates the overfitting problem. The code is available at https://github.com/Yummy198913/IV-tuning.
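As a rough illustration of what a "low-rank" feature space means, the sketch below computes the numerical rank of a small feature matrix (one row per sample) by Gaussian elimination. This is a simplified stand-in chosen for illustration, not the analysis performed in the paper; the function name, tolerance, and toy matrices are assumptions.

```python
def numerical_rank(rows, tol=1e-9):
    """Rank of a matrix (list of equal-length rows) via Gaussian elimination
    with partial pivoting; entries with magnitude <= tol count as zero."""
    m = [[float(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        if rank == len(m):
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot: this column adds no new direction
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            for j in range(col, n_cols):
                m[i][j] -= factor * m[rank][j]
        rank += 1
    return rank


# Toy features spanning three independent directions.
diverse = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
# Toy features that are all multiples of a single direction.
collapsed = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [-1, -2, -3]]

print(numerical_rank(diverse), numerical_rank(collapsed))
```

The second matrix has as many samples as the first, yet every feature vector lies along one direction; this is the kind of collapse the abstract associates with impaired generalization.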

