NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection
This work addresses the issue of false positives and false negatives in pedestrian detection for crowded, occluded scenes, offering an incremental improvement by integrating NMS into the training process.
The paper tackles the weak connection between training targets and evaluation metrics that Non-Maximum Suppression (NMS) causes in object detection, especially in crowded scenes. It proposes NMS-Loss, which lets the NMS procedure be trained end-to-end without extra network parameters. The resulting detector reaches a Miss Rate of 5.92% on Caltech and 10.08% on CityPersons, outperforming state-of-the-art methods.
Non-Maximum Suppression (NMS) is essential for object detection and affects the evaluation results by introducing False Positives (FP) and False Negatives (FN), especially in crowded, occluded scenes. In this paper, we raise the problem of the weak connection between the training targets and the evaluation metrics caused by NMS, and propose a novel NMS-Loss that allows the NMS procedure to be trained end-to-end without any additional network parameters. Our NMS-Loss penalizes two cases: when an FP is not suppressed by NMS, and when a detection is wrongly eliminated by NMS, producing an FN. Specifically, we propose a pull loss to pull predictions with the same target close to each other, and a push loss to push predictions with different targets away from each other. Experimental results show that with the help of NMS-Loss, our detector, namely NMS-Ped, achieves a Miss Rate of 5.92% on the Caltech dataset and 10.08% on the CityPersons dataset, both better than state-of-the-art competitors.
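
The two failure cases named in the abstract can be illustrated with a minimal Python sketch. It runs standard greedy NMS on three hand-made boxes (hypothetical values, not taken from Caltech or CityPersons) and shows that no single IoU threshold keeps exactly one box per pedestrian. The function `toy_pull_push` and its penalty terms are assumptions made for illustration only: the abstract does not give the NMS-Loss formulas, so this is a simplified stand-in for the pull/push idea, not the authors' formulation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, thresh):
    """Standard greedy NMS: indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it overlaps no kept box by more than thresh.
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep


def toy_pull_push(boxes, scores, targets, thresh):
    """Illustrative pull/push penalties (hypothetical, not the paper's loss)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep, pull, push = [], 0.0, 0.0
    for i in order:
        # Kept box that overlaps box i the most (None if nothing is kept yet).
        best = max(keep, key=lambda k: iou(boxes[i], boxes[k]), default=None)
        if best is not None and iou(boxes[i], boxes[best]) > thresh:
            # Box i is suppressed. If it belongs to a different target, a
            # pedestrian is lost (FN): penalize the overlap above thresh.
            if targets[i] != targets[best]:
                push += iou(boxes[i], boxes[best]) - thresh
            continue
        # Box i survives. If its target already has a kept box, box i is a
        # duplicate (FP): penalize the overlap missing to reach thresh.
        same = [k for k in keep if targets[k] == targets[i]]
        if same:
            pull += thresh - iou(boxes[i], boxes[same[0]])
        keep.append(i)
    return pull, push


# Hypothetical crowded scene: boxes 0 and 2 predict pedestrian 0,
# box 1 predicts a heavily occluded neighbour, pedestrian 1.
boxes = [(0, 0, 10, 20), (4, 0, 14, 20), (-5, 0, 5, 20)]
scores = [0.9, 0.8, 0.7]
targets = [0, 1, 0]

keep_low = greedy_nms(boxes, scores, 0.3)   # neighbour lost (FN)
keep_mid = greedy_nms(boxes, scores, 0.4)   # neighbour lost, duplicate kept
keep_high = greedy_nms(boxes, scores, 0.5)  # duplicate kept (FP)
pull, push = toy_pull_push(boxes, scores, targets, 0.4)
```

In this toy scene the ideal result would keep boxes 0 and 1. Lowering the threshold removes the duplicate but also the occluded neighbour, and raising it does the reverse, which is the mismatch between training targets and post-NMS evaluation that the abstract describes. A loss of the pull/push kind would instead change the predicted boxes so that a fixed threshold separates the two cases.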