CV · Jun 16, 2025

Evolution of ReID: From Early Methods to LLM Integration

Amran Bhuiyan, Mizanur Rahman, Md Tahmid Rahman Laskar, Aijun An, Jimmy Xiangji Huang

arXiv:2506.13039v13.61 citations · h-index: 20

Originality: Synthesis-oriented

AI Analysis

This survey provides a comprehensive review for researchers in computer vision and NLP, but its contribution is incremental because it mainly surveys existing approaches.

This survey traces the evolution of person re-identification (ReID) from early methods to deep learning and recent integration with large language models (LLMs), highlighting that LLM-generated textual descriptions improve accuracy in complex cases.

Person re-identification (ReID) has evolved from handcrafted feature-based methods to deep learning approaches and, more recently, to models incorporating large language models (LLMs). Early methods struggled with variations in lighting, pose, and viewpoint, but deep learning addressed these issues by learning robust visual features. Building on this, LLMs now enable ReID systems to integrate semantic and contextual information through natural language. This survey traces that full evolution and offers one of the first comprehensive reviews of ReID approaches that leverage LLMs, where textual descriptions are used as privileged information to improve visual matching. A key contribution is the use of dynamic, identity-specific prompts generated by GPT-4o, which enhance the alignment between images and text in vision-language ReID systems. Experimental results show that these descriptions improve accuracy, especially in complex or ambiguous cases. To support further research, we release a large set of GPT-4o-generated descriptions for standard ReID datasets. By bridging computer vision and natural language processing, this survey offers a unified perspective on the field's development and outlines key future directions such as better prompt design, cross-modal transfer learning, and real-world adaptability.
