CV · Jan 3, 2024

Frequency Domain Modality-invariant Feature Learning for Visible-infrared Person Re-Identification

arXiv:2401.01839v2 · 8 citations · h-index: 31
AI Analysis

This work addresses cross-modality discrepancies in person re-identification for surveillance applications. The approach is novel, but its improvements over existing methods are incremental.

The paper tackles visible-infrared person re-identification by identifying amplitude differences as the primary cause of modality discrepancy. It proposes FDMNet, a frequency-domain framework that reports an mAP of 72.3% and a rank-1 accuracy of 75.8% on SYSU-MM01.

Visible-infrared person re-identification (VI-ReID) is challenging due to the significant cross-modality discrepancies between visible and infrared images. While existing methods have focused on designing complex network architectures or using metric learning constraints to learn modality-invariant features, they often overlook which specific component of the image causes the modality discrepancy problem. In this paper, we first reveal that the difference in the amplitude component of visible and infrared images is the primary factor that causes the modality discrepancy, and we further propose a novel Frequency Domain Modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from a frequency-domain perspective. Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) module and the Phase-Preserving Normalization (PPNorm) module, to enhance the modality-invariant amplitude component and suppress the modality-specific component at both the image and feature levels. Extensive experimental results on two standard benchmarks, SYSU-MM01 and RegDB, demonstrate the superior performance of our FDMNet against state-of-the-art methods.
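
The abstract's central claim rests on splitting an image's Fourier spectrum into an amplitude component and a phase component. As a rough illustration of that decomposition only, the NumPy sketch below separates the two components, recombines them, and builds a hybrid image that keeps one image's phase while taking another's amplitude. It is not the paper's IAF or PPNorm module; the function names (`split_amplitude_phase`, `recombine`, `swap_amplitude`) and the random arrays standing in for visible and infrared images are assumptions made for this example.

```python
import numpy as np


def split_amplitude_phase(image):
    """Return the amplitude and phase of the 2-D Fourier spectrum of an image."""
    spectrum = np.fft.fft2(image, axes=(0, 1))
    return np.abs(spectrum), np.angle(spectrum)


def recombine(amplitude, phase):
    """Rebuild a spatial-domain image from an amplitude and a phase component."""
    spectrum = amplitude * np.exp(1j * phase)
    # For components taken from real images the imaginary part is rounding noise.
    return np.real(np.fft.ifft2(spectrum, axes=(0, 1)))


def swap_amplitude(phase_source, amplitude_source):
    """Keep the phase of one image and take the amplitude of another."""
    _, phase = split_amplitude_phase(phase_source)
    amplitude, _ = split_amplitude_phase(amplitude_source)
    return recombine(amplitude, phase)


# Hypothetical stand-ins for a visible and an infrared image (single channel).
rng = np.random.default_rng(0)
visible = rng.random((8, 4))
infrared = rng.random((8, 4))

amplitude, phase = split_amplitude_phase(visible)
reconstructed = recombine(amplitude, phase)   # should match `visible`
hybrid = swap_amplitude(visible, infrared)    # visible phase, infrared amplitude
```

Comparing `hybrid` with `visible` on real image pairs is one informal way to see how much of the appearance gap travels with the amplitude component, which is the property the abstract attributes to it.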
