Pedestrian Detection by Exemplar-Guided Contrastive Learning
The work addresses pedestrian detection in real-world scenes where pedestrian appearance varies widely, but the contribution is incremental because it builds on existing contrastive learning and exemplar-based approaches.
The paper tackles pedestrian detection under substantial appearance diversity. It proposes an exemplar-guided contrastive learning method that minimizes the semantic distance between pedestrians with different appearances while maximizing their distance from the background, and validates it through extensive experiments on daytime and nighttime datasets.
Typical pedestrian detection methods focus on either handling mutual occlusion between crowded pedestrians or dealing with the varying scales of pedestrians. Detecting pedestrians with substantial appearance diversity, such as different silhouettes, viewpoints, or clothing, remains a crucial challenge. Instead of learning each of these diverse appearance features individually, as most existing methods do, we propose to use contrastive learning to guide feature learning so that, in the learned feature space, the semantic distance between pedestrians with different appearances is minimized to eliminate the appearance diversity, whilst the distance between pedestrians and background is maximized. To make contrastive learning more efficient and effective, we construct an exemplar dictionary of representative pedestrian appearances as prior knowledge, which is used to build effective contrastive training pairs and thus guide contrastive learning. In addition, the exemplar dictionary is further leveraged during inference to evaluate the quality of pedestrian proposals by measuring the semantic distance between each proposal and the exemplar dictionary. Extensive experiments on both daytime and nighttime pedestrian detection validate the effectiveness of the proposed method.
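The two roles the abstract gives the exemplar dictionary (supplying positives for contrastive training, and scoring proposals at inference) can be illustrated with a minimal sketch. The abstract does not state the paper's actual loss or scoring rule, so everything here is an assumption made for illustration: the InfoNCE-style loss, cosine similarity as the semantic distance measure, the temperature value, the maximum-similarity score, and the function names `exemplar_contrastive_loss` and `proposal_quality`.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def exemplar_contrastive_loss(anchor, exemplars, background, temperature=0.1):
    """InfoNCE-style loss for one pedestrian feature (assumed formulation).

    anchor:     (d,) feature of a pedestrian proposal
    exemplars:  (k, d) exemplar dictionary entries, treated as positives
    background: (m, d) background features, treated as negatives
    """
    a = l2_normalize(anchor)
    pos = l2_normalize(exemplars) @ a / temperature   # (k,) positive logits
    neg = l2_normalize(background) @ a / temperature  # (m,) negative logits
    # Per positive i: -log(exp(pos_i) / (exp(pos_i) + sum_j exp(neg_j))),
    # computed with logaddexp for numerical stability.
    neg_lse = np.logaddexp.reduce(neg)
    losses = np.logaddexp(pos, neg_lse) - pos
    return float(losses.mean())

def proposal_quality(proposal, exemplars):
    """Score a proposal by its highest cosine similarity to any exemplar
    (an assumed scoring rule, not necessarily the paper's)."""
    sims = l2_normalize(exemplars) @ l2_normalize(proposal)
    return float(sims.max())

# Toy 2-D features, invented purely for illustration.
exemplars = np.array([[1.0, 0.0], [0.9, 0.1]])
background = np.array([[0.0, 1.0], [-1.0, 0.0]])
pedestrian_like = np.array([1.0, 0.05])
background_like = np.array([0.05, 1.0])

loss_near = exemplar_contrastive_loss(pedestrian_like, exemplars, background)
loss_far = exemplar_contrastive_loss(background_like, exemplars, background)
score_near = proposal_quality(pedestrian_like, exemplars)
score_far = proposal_quality(background_like, exemplars)
```

In this toy setup, a feature that lies close to the exemplars yields a lower loss and a higher quality score than one that lies close to the background features, which is the behavior the abstract describes: pedestrian features are pulled toward the dictionary and pushed away from background, and the same dictionary then ranks proposals at inference.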