CV | Mar 3, 2022

Cross-Modality Earth Mover's Distance for Visible Thermal Person Re-Identification

arXiv:2203.01675v1 | 40 citations | h-index: 101
Originality: Incremental advance
AI Analysis

This work improves person re-identification across visible and thermal modalities, which matters for surveillance and security applications, but the advance is incremental because it builds on existing distribution alignment methods.

The paper tackles the problem of visible thermal person re-identification by addressing inter-modality discrepancy and intra-identity variations, proposing Cross-Modality Earth Mover's Distance (CM-EMD) with auxiliary techniques to achieve state-of-the-art performance on two benchmarks.

Visible thermal person re-identification (VT-ReID) suffers from inter-modality discrepancy and intra-identity variations. Distribution alignment is a popular solution for VT-ReID, but its effectiveness is usually limited by the influence of intra-identity variations. In this paper, we propose the Cross-Modality Earth Mover's Distance (CM-EMD), which alleviates the impact of intra-identity variations during modality alignment. CM-EMD selects an optimal transport strategy and assigns high weights to pairs that have a smaller intra-identity variation. In this manner, the model focuses on reducing the inter-modality discrepancy while paying less attention to intra-identity variations, leading to a more effective modality alignment. Moreover, we introduce two techniques to strengthen CM-EMD. First, Cross-Modality Discrimination Learning (CM-DL) is designed to overcome the discrimination degradation problem caused by modality alignment. By reducing the ratio between intra-identity and inter-identity variances, CM-DL leads the model to learn more discriminative representations. Second, we construct the Multi-Granularity Structure (MGS), which enables modality alignment at both coarse- and fine-grained levels with the proposed CM-EMD. Extensive experiments show the benefits of the proposed CM-EMD and its auxiliary techniques (CM-DL and MGS). Our method achieves state-of-the-art performance on two VT-ReID benchmarks.
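
To make the optimal transport idea in the abstract more concrete, here is a minimal sketch of how a transport plan between visible and thermal features concentrates weight on low-cost pairs. It is not the authors' implementation. It assumes squared Euclidean distance as the pairwise cost, uniform marginals over the samples, and an entropy-regularised Sinkhorn iteration as a stand-in approximation for the exact Earth Mover's Distance. The names `sq_dist`, `sinkhorn_plan`, and `transport_cost` are hypothetical and chosen only for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors (assumed cost)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Approximate optimal transport plan with uniform marginals.

    Sinkhorn iteration is an entropy-regularised approximation of EMD,
    used here only because it is short enough to write without a solver.
    """
    n, m = len(cost), len(cost[0])
    row_mass, col_mass = 1.0 / n, 1.0 / m
    # Gibbs kernel: cheap pairs get large entries, costly pairs tiny ones.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns towards the target marginals.
        u = [row_mass / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(visible, thermal, reg=0.1):
    """Return (transport cost, plan) between two sets of feature vectors."""
    cost = [[sq_dist(a, b) for b in thermal] for a in visible]
    plan = sinkhorn_plan(cost, reg)
    total = sum(
        plan[i][j] * cost[i][j]
        for i in range(len(visible))
        for j in range(len(thermal))
    )
    return total, plan


# Toy features for one identity: each visible sample has one close thermal match.
visible = [[0.0, 0.0], [1.0, 1.0]]
thermal = [[0.1, 0.0], [1.0, 0.9]]
total, plan = transport_cost(visible, thermal)
```

In this toy example the plan places almost all of its mass on the two closely matching pairs, so the resulting cost is far lower than it would be if every visible sample were paired equally with every thermal sample. That is the behaviour the abstract describes: pairs with smaller variation dominate the alignment signal.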

