PAT++: a cautionary tale about generative visual augmentation for Object Re-identification
This work highlights limitations in applying generative models to fine-grained recognition tasks, cautioning against assumptions of transferability in identity-preserving applications.
The study investigated the effectiveness of generative data augmentation for object re-identification, finding that it consistently degraded performance due to domain shifts and loss of identity-defining features.
Generative data augmentation has demonstrated gains in several vision tasks, but its impact on object re-identification, where preserving fine-grained visual details is essential, remains largely unexplored. In this work, we assess the effectiveness of identity-preserving image generation for object re-identification. Our novel pipeline, named PAT++, incorporates Diffusion Self-Distillation into the well-established Part-Aware Transformer. Using the Urban Elements ReID Challenge dataset, we conduct extensive experiments in which generated images are used for both model training and query expansion. Our results show consistent performance degradation, driven by domain shifts and a failure to retain identity-defining features. These findings challenge assumptions about the transferability of generative models to fine-grained recognition tasks and expose key limitations in current approaches to visual augmentation for identity-preserving applications.
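To make the query-expansion setting concrete, the following is a minimal sketch of one common way to fold generated views into a query at retrieval time. It is an illustration under stated assumptions, not the PAT++ implementation: the abstract does not specify how generated images are combined with the query, so the embedding-averaging scheme, the `weight` parameter, and the function names `expand_query` and `rank_gallery` are all hypothetical. Embeddings are assumed to come from some ReID backbone and are represented here as plain lists of floats.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def expand_query(query_emb, generated_embs, weight=0.5):
    """Blend a query embedding with the mean embedding of its generated views.

    Hypothetical scheme: `weight` is the share given to the generated views;
    weight=0 leaves the (normalized) query untouched.
    """
    q = l2_normalize(query_emb)
    if not generated_embs:
        return q
    normed = [l2_normalize(g) for g in generated_embs]
    mean_g = [sum(col) / len(normed) for col in zip(*normed)]
    mixed = [(1.0 - weight) * a + weight * b for a, b in zip(q, mean_g)]
    return l2_normalize(mixed)


def rank_gallery(query_emb, gallery_embs):
    """Return gallery indices sorted by decreasing cosine similarity."""
    q = l2_normalize(query_emb)
    sims = [sum(a * b for a, b in zip(q, l2_normalize(g))) for g in gallery_embs]
    return sorted(range(len(gallery_embs)), key=lambda i: -sims[i])
```

Under this scheme, a generated view that drifts away from the true identity pulls the expanded query toward other gallery items. That is one plausible mechanism for the degradation described above, although the sketch does not reproduce the paper's experiments.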