MDReID: Modality-Decoupled Learning for Any-to-Any Multi-Modal Object Re-Identification
MDReID addresses robustness and scalability issues in real-world ReID systems for applications such as surveillance, where query and gallery images often come from different sensors. The contribution is incremental, since it builds on existing multi-modal ReID methods.
The paper tackles modality inconsistencies in object re-identification (ReID) by proposing MDReID, a framework for any-to-any image-level ReID that works in both modality-matched and modality-mismatched scenarios. On three benchmarks (RGBNT201, RGBNT100, MSVR310), it reports mAP improvements of 9.8%, 3.0%, and 11.5% respectively in modality-matched cases, and average gains of 3.4%, 11.8%, and 10.9% respectively in modality-mismatched cases.
Real-world object re-identification (ReID) systems often face modality inconsistencies, where query and gallery images come from different sensors (e.g., RGB, NIR, TIR). However, most existing methods assume modality-matched conditions, which limits their robustness and scalability in practical applications. To address this challenge, we propose MDReID, a flexible any-to-any image-level ReID framework designed to operate under both modality-matched and modality-mismatched scenarios. MDReID builds on the insight that modality information can be decomposed into two components: modality-shared features that are predictable and transferable, and modality-specific features that capture unique, modality-dependent characteristics. To effectively leverage this, MDReID introduces two key components: the Modality Decoupling Learning (MDL) and Modality-aware Metric Learning (MML). Specifically, MDL explicitly decomposes modality features into modality-shared and modality-specific representations, enabling effective retrieval in both modality-aligned and mismatched scenarios. MML, a tailored metric learning strategy, further enforces orthogonality and complementarity between the two components to enhance discriminative power across modalities. Extensive experiments conducted on three challenging multi-modality ReID benchmarks (RGBNT201, RGBNT100, MSVR310) consistently demonstrate the superiority of MDReID. Notably, MDReID achieves significant mAP improvements of 9.8%, 3.0%, and 11.5% in general modality-matched scenarios, and average gains of 3.4%, 11.8%, and 10.9% in modality-mismatched scenarios, respectively. The code is available at: https://github.com/stone96123/MDReID.
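The idea of splitting features into shared and specific parts, and keeping them orthogonal, can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the MDL architecture or the MML loss, so everything below is an assumption made for illustration. The function names (`orthogonality_penalty`, `any_to_any_distance`), the squared-cosine penalty, and the rule "compare shared features across all modality pairs, compare specific features only for modalities present on both sides" are all hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    nu = math.sqrt(dot(u, u))
    nv = math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def orthogonality_penalty(shared, specific):
    """Assumed stand-in for an orthogonality term: squared cosine similarity.

    The value is 0 when the shared and specific vectors are orthogonal
    and 1 when they are parallel. Both vectors must have the same length.
    """
    return cosine(shared, specific) ** 2


def any_to_any_distance(query, gallery):
    """Hypothetical matching rule for mismatched modality sets.

    `query` and `gallery` map a modality name (e.g. "RGB", "NIR", "TIR")
    to a (shared, specific) pair of feature vectors.
    Shared features are compared across every query/gallery modality pair;
    specific features are compared only for modalities present in both.
    The result is the mean cosine distance over all compared pairs.
    """
    if not query or not gallery:
        raise ValueError("query and gallery need at least one modality each")
    shared_terms = [
        1.0 - cosine(q_shared, g_shared)
        for q_shared, _ in query.values()
        for g_shared, _ in gallery.values()
    ]
    common = sorted(set(query) & set(gallery))
    specific_terms = [
        1.0 - cosine(query[m][1], gallery[m][1]) for m in common
    ]
    terms = shared_terms + specific_terms
    return sum(terms) / len(terms)
```

Under these assumptions, an RGB-only query can still be scored against a TIR-only gallery item, because the shared features are always comparable. The specific features contribute only when both sides include the same modality.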