CV · Mar 31, 2025

LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification

arXiv:2503.23722v3 · 7 citations · h-index: 13 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses AG-ReID for intelligent transportation systems and offers a new way to exploit semantic person attributes, but the advance is incremental because it builds on existing CLIP models.

The paper tackles the problem of Aerial-Ground person Re-Identification (AG-ReID) by proposing LATex, a framework that uses prompt-tuning with CLIP to incorporate attribute-based text knowledge, achieving improved performance on three benchmarks.

As an important task in intelligent transportation systems, Aerial-Ground person Re-IDentification (AG-ReID) aims to retrieve specific persons across heterogeneous cameras with different viewpoints. Previous methods typically adopt deep learning-based models, focusing on extracting view-invariant features. However, they usually overlook the semantic information in person attributes. In addition, existing training strategies often rely on full fine-tuning of large-scale models, which significantly increases training costs. To address these issues, we propose a novel framework named LATex for AG-ReID, which adopts prompt-tuning strategies to leverage attribute-based text knowledge. Specifically, with the Contrastive Language-Image Pre-training (CLIP) model, we first propose an Attribute-aware Image Encoder (AIE) to extract both global semantic features and attribute-aware features from input images. Then, with these features, we propose a Prompted Attribute Classifier Group (PACG) to predict person attributes and obtain attribute representations. Finally, we design a Coupled Prompt Template (CPT) to transform attribute representations and view information into structured sentences. These sentences are processed by the text encoder of CLIP to generate more discriminative features. As a result, our framework can fully leverage attribute-based text knowledge to improve AG-ReID performance. Extensive experiments on three AG-ReID benchmarks demonstrate the effectiveness of our proposed methods. The source code is available at https://github.com/kevinhu314/LATex.
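
To make the last two stages of the abstract more concrete, the following is a minimal sketch of how per-attribute classifier scores and a view label could be turned into a structured sentence for a text encoder. It is an illustration only, not the authors' implementation: the attribute vocabulary, the function names `predict_attributes` and `build_prompt`, and the sentence wording are all hypothetical, and the actual framework may operate on learned attribute representations and prompt tokens rather than literal words.

```python
from typing import Dict, List

# Hypothetical attribute vocabulary (assumption); the real attribute set
# depends on the dataset annotations and is not given in the abstract.
ATTRIBUTE_CLASSES: Dict[str, List[str]] = {
    "gender": ["male", "female"],
    "upper_color": ["black", "white", "red", "blue"],
    "lower_color": ["black", "white", "red", "blue"],
}


def predict_attributes(logits: Dict[str, List[float]]) -> Dict[str, str]:
    """Pick the highest-scoring class for each attribute group."""
    predicted: Dict[str, str] = {}
    for name, scores in logits.items():
        classes = ATTRIBUTE_CLASSES[name]
        if len(scores) != len(classes):
            raise ValueError(f"expected {len(classes)} scores for {name!r}")
        best = max(range(len(scores)), key=scores.__getitem__)
        predicted[name] = classes[best]
    return predicted


def build_prompt(attributes: Dict[str, str], view: str) -> str:
    """Fill a fixed sentence template with attributes and the camera view.

    The wording of this template is an assumption made for illustration.
    """
    return (
        f"A photo of a {attributes['gender']} person "
        f"wearing {attributes['upper_color']} upper clothes "
        f"and {attributes['lower_color']} lower clothes, "
        f"captured from the {view} view."
    )


# Made-up scores standing in for the outputs of the attribute classifiers.
example_logits = {
    "gender": [0.2, 1.3],
    "upper_color": [0.1, 0.0, 2.0, 0.5],
    "lower_color": [1.5, 0.2, 0.1, 0.3],
}
example_attributes = predict_attributes(example_logits)
example_sentence = build_prompt(example_attributes, view="aerial")
print(example_sentence)
```

In a CLIP-based pipeline, a sentence like this would be tokenized and passed to the text encoder, so that the resulting text feature carries both attribute and view information.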

