Dynamic Enhancement Network for Partial Multi-modality Person Re-identification
This work addresses a practical issue in real-world multi-modality systems: modalities such as RGB, NIR, and TIR may be incomplete. The issue matters for applications like surveillance but is often overlooked in prior work.
The paper tackles the problem of arbitrary missing modalities in multi-modality person re-identification by proposing a dynamic enhancement network (DENet) that recovers missing information and adaptively enhances features, achieving state-of-the-art results on the RGBNT201 person Re-ID dataset and the RGBNT100 vehicle Re-ID dataset.
Many existing multi-modality studies assume modality integrity, that is, that every modality is always available. In practice, however, arbitrary modalities are frequently missing, and this problem remains understudied despite its importance for multi-modality person re-identification (Re-ID). To this end, we design a novel dynamic enhancement network (DENet) for partial multi-modality person Re-ID, which tolerates arbitrary missing modalities while maintaining the representation ability of multiple modalities. Specifically, the multi-modal representation of the RGB, near-infrared (NIR), and thermal-infrared (TIR) images is learned by three branches, in which the information of missing modalities is recovered by a feature transformation module. Because the missing state can change from sample to sample, we design a dynamic enhancement module that adaptively enhances modality features according to the missing state, improving the multi-modality representation. Extensive experiments on the multi-modality person Re-ID dataset RGBNT201 and the vehicle Re-ID dataset RGBNT100, compared with state-of-the-art methods, verify the effectiveness of our method in complex and changeable environments.
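The data flow described in the abstract (three modality branches, recovery of missing features, then enhancement conditioned on the missing state) can be illustrated with a toy sketch. This is not the DENet architecture: the abstract does not specify the modules, so the sketch makes loud assumptions. Recovery is replaced by an element-wise mean of the available features, enhancement by fixed weights that depend only on which modalities are missing, and the names `recover_missing`, `enhancement_weights`, `fuse`, and `recovered_weight` are all hypothetical.

```python
from typing import Dict, List, Optional

MODALITIES = ("rgb", "nir", "tir")
Features = Dict[str, Optional[List[float]]]  # None marks a missing modality


def recover_missing(features: Features) -> Dict[str, List[float]]:
    """Fill each missing modality with the mean of the available features.

    Assumption: a stand-in for a learned feature transformation module.
    """
    available = [features.get(m) for m in MODALITIES if features.get(m) is not None]
    if not available:
        raise ValueError("at least one modality must be present")
    dim = len(available[0])
    mean = [sum(f[i] for f in available) / len(available) for i in range(dim)]
    return {
        m: list(features[m]) if features.get(m) is not None else list(mean)
        for m in MODALITIES
    }


def enhancement_weights(features: Features, recovered_weight: float = 0.5) -> Dict[str, float]:
    """Compute per-modality weights from the missing state.

    Assumption: observed modalities get raw weight 1.0, recovered ones get
    `recovered_weight`; weights are normalized to sum to 1.
    """
    raw = {m: 1.0 if features.get(m) is not None else recovered_weight for m in MODALITIES}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}


def fuse(features: Features, recovered_weight: float = 0.5) -> List[float]:
    """Recover missing features, reweight by missing state, and concatenate."""
    filled = recover_missing(features)
    weights = enhancement_weights(features, recovered_weight)
    fused: List[float] = []
    for m in MODALITIES:
        fused.extend(weights[m] * x for x in filled[m])
    return fused


if __name__ == "__main__":
    # NIR is missing for this sample; RGB and TIR features are observed.
    sample = {"rgb": [1.0, 2.0], "nir": None, "tir": [3.0, 4.0]}
    print(enhancement_weights(sample))
    print(fuse(sample))
```

The point of the sketch is only that the fused representation keeps a fixed size regardless of which modalities are missing, while the contribution of each branch changes with the missing state. In a learned model both the recovery and the weighting would presumably be trained rather than hand-set.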