Self-Supervised Person Detection in 2D Range Data using a Calibrated Camera
This work addresses the scarcity of annotated datasets for training 2D LiDAR-based person detectors, a significant obstacle for robotics applications that require robust person detection.
This paper proposes a method that automatically generates training labels for 2D LiDAR-based person detectors from the bounding boxes of an image-based detector running on a calibrated camera. Experiments on the JackRabbot dataset show that self-supervised detectors trained with these pseudo-labels outperform detectors trained only on a different dataset and, combined with robust training techniques, reach performance close to that of detectors trained on manual annotations.
Deep learning is the essential building block of state-of-the-art person detectors in 2D range data. However, only a few annotated datasets are available for training and testing these deep networks, potentially limiting their performance when deployed in new environments or with different LiDAR models. We propose a method that uses bounding boxes from an image-based detector (e.g., Faster R-CNN) on a calibrated camera to automatically generate training labels (called pseudo-labels) for 2D LiDAR-based person detectors. Through experiments on the JackRabbot dataset with two detector models, DROW3 and DR-SPAAM, we show that self-supervised detectors, trained or fine-tuned with pseudo-labels, outperform detectors trained only on a different dataset. Combined with robust training techniques, the self-supervised detectors reach performance close to that of detectors trained using manual annotations of the target dataset. Our method is an effective way to improve person detectors during deployment without any additional labeling effort, and we release our source code to support relevant robotic applications.
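To make the pseudo-labeling idea concrete, the following is a minimal sketch, not the released implementation. It assumes a pinhole camera with intrinsics `K`, a LiDAR-to-camera extrinsic transform `(R, t)`, and image boxes given as `(x1, y1, x2, y2)` in pixels. The function names, the `depth_margin` foreground heuristic, and the `min_points` threshold are illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def scan_to_points(ranges, angle_min, angle_inc):
    """Convert a 2D range scan to (N, 3) points in the LiDAR frame (z = 0)."""
    ranges = np.asarray(ranges, dtype=float)
    angles = angle_min + angle_inc * np.arange(len(ranges))
    return np.stack(
        [ranges * np.cos(angles), ranges * np.sin(angles), np.zeros_like(ranges)],
        axis=1,
    )

def project_to_image(points_lidar, R, t, K):
    """Project LiDAR points to pixels; also return a mask of points in front of the camera."""
    pts_cam = points_lidar @ R.T + t           # LiDAR frame -> camera frame
    in_front = pts_cam[:, 2] > 1e-6            # discard points behind the image plane
    uv = np.zeros((len(points_lidar), 2))
    proj = pts_cam[in_front] @ K.T             # pinhole projection
    uv[in_front] = proj[:, :2] / proj[:, 2:3]
    return uv, in_front

def pseudo_labels(points_lidar, uv, in_front, boxes, depth_margin=0.5, min_points=3):
    """Return one (x, y) pseudo-label per box that contains enough foreground points.

    Assumed heuristic: among the points projecting into a box, keep those within
    `depth_margin` metres of the closest one and average them.
    """
    xy = points_lidar[:, :2]
    dist = np.linalg.norm(xy, axis=1)
    labels = []
    for x1, y1, x2, y2 in boxes:
        inside = (
            in_front
            & (uv[:, 0] >= x1) & (uv[:, 0] <= x2)
            & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
        )
        if not inside.any():
            continue
        foreground = inside & (dist <= dist[inside].min() + depth_margin)
        if foreground.sum() < min_points:
            continue
        labels.append(xy[foreground].mean(axis=0))
    return np.array(labels).reshape(-1, 2)
```

The resulting `(x, y)` positions would serve as training targets for a LiDAR-based detector in place of manual annotations. Because such labels inherit errors from the image detector and the calibration, the abstract's remark about combining them with robust training techniques applies.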