CV · Jul 17, 2024

Mutual Information Guided Optimal Transport for Unsupervised Visible-Infrared Person Re-identification

arXiv:2407.12758v1 · 4 citations · h-index: 23
Originality: Incremental advance
AI Analysis

The paper addresses cross-modality person retrieval in surveillance, a computer vision problem, and offers an incremental improvement in unsupervised learning.

The paper tackles unsupervised visible-infrared person re-identification by proposing a method based on mutual information and optimal transport that learns modality-invariant features without labels, achieving 60.6% and 90.3% Rank-1 accuracy on the SYSU-MM01 and RegDB benchmarks, respectively.

Unsupervised visible-infrared person re-identification (USVI-ReID) is a challenging retrieval task that aims to retrieve cross-modality pedestrian images without using any label information. In this task, the large cross-modality variance makes it difficult to generate reliable cross-modality labels, and the lack of annotations makes it harder to learn modality-invariant features. In this paper, we first deduce an optimization objective for unsupervised VI-ReID based on the mutual information between the model's cross-modality input and output. Through an equivalent derivation, three learning principles are obtained: "Sharpness" (entropy minimization), "Fairness" (uniform label distribution), and "Fitness" (reliable cross-modality matching). Under their guidance, we design an iterative training strategy that alternates between model training and cross-modality matching. In the matching stage, a uniform-prior-guided optimal transport assignment ("Fitness", "Fairness") is proposed to select matched visible and infrared prototypes. In the training stage, we use this matching information to introduce prototype-based contrastive learning that minimizes the intra- and cross-modality entropy ("Sharpness"). Extensive experimental results on benchmarks demonstrate the effectiveness of our method, e.g., 60.6% and 90.3% Rank-1 accuracy on SYSU-MM01 and RegDB, respectively, without any annotations.
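The matching stage described in the abstract can be illustrated with a small sketch. The abstract does not state the cost function, the regularizer, or the solver, so everything below is an assumption: it uses cosine distance between prototypes as the cost and the standard entropy-regularized Sinkhorn iteration with uniform row and column marginals as a stand-in for the "uniform prior guided" assignment. The function names (`cosine_cost`, `sinkhorn_uniform`, `match`), the parameters `eps` and `n_iters`, and the toy prototypes are all hypothetical.

```python
import math


def cosine_cost(vis, ir):
    """Cost matrix of 1 - cosine similarity for every (visible, infrared) prototype pair."""
    def norm(vec):
        return math.sqrt(sum(x * x for x in vec)) or 1.0

    return [
        [1.0 - sum(a * b for a, b in zip(v, r)) / (norm(v) * norm(r)) for r in ir]
        for v in vis
    ]


def sinkhorn_uniform(cost, eps=0.05, n_iters=200):
    """Entropy-regularized transport plan whose row and column marginals are uniform.

    Assumption: the uniform marginals play the role of the "Fairness" prior,
    and the low-cost pairing plays the role of "Fitness".
    """
    n, m = len(cost), len(cost[0])
    row_mass, col_mass = 1.0 / n, 1.0 / m
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to hit the target marginals.
        u = [row_mass / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def match(plan):
    """For each visible prototype, pick the infrared prototype carrying the most mass."""
    return [max(range(len(row)), key=row.__getitem__) for row in plan]


# Toy prototypes (hypothetical): three visible and three infrared cluster centers.
visible_protos = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
infrared_protos = [[0.1, 0.9, 0.0], [0.9, 0.0, 0.1], [0.0, 0.1, 0.9]]

plan = sinkhorn_uniform(cosine_cost(visible_protos, infrared_protos))
pairs = match(plan)
print(pairs)
```

In a training loop of the kind the abstract outlines, the resulting pairs would presumably serve as cross-modality pseudo-labels for the prototype-based contrastive loss, with prototypes and matches recomputed each round. The uniform marginals keep all visible prototypes from collapsing onto one infrared prototype, which is one plausible reading of the "Fairness" principle.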

