Shape-Aware Monocular 3D Object Detection
This work addresses accurate 3D object detection from a single image for applications such as autonomous driving. The contribution is incremental, but it also introduces a novel evaluation metric.
The paper tackles monocular 3D object detection by integrating an instance-segmentation head that improves robustness to occluded and truncated objects. The method outperforms the baseline on both the existing and the proposed metrics while maintaining real-time efficiency.
The detection of 3D objects through a single perspective camera is a challenging issue. Anchor-free and keypoint-based models have received increasing attention recently due to their effectiveness and simplicity. However, most of these methods are vulnerable to occluded and truncated objects. In this paper, a single-stage monocular 3D object detection model is proposed. An instance-segmentation head is integrated into the model training, which allows the model to be aware of the visible shape of a target object. As a result, the detection largely avoids interference from irrelevant regions surrounding the target objects. In addition, we reveal that the popular IoU-based evaluation metrics, which were originally designed for evaluating stereo or LiDAR-based detection methods, are insensitive to the improvement of monocular 3D object detection algorithms. A novel evaluation metric, namely average depth similarity (ADS), is proposed for monocular 3D object detection models. Our method outperforms the baseline on both the popular and the proposed evaluation metrics while maintaining real-time efficiency.
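The claim that IoU-based metrics are insensitive to improvements in monocular detection can be illustrated with a small sketch. It is not the paper's ADS metric, whose definition is not given here. It only computes the 3D IoU between a ground-truth box and predictions that differ from it in depth alone. The helper `axis_aligned_iou_3d`, the car dimensions, the 30 m distance, and the use of axis-aligned boxes without rotation are all assumptions chosen for illustration.

```python
def axis_aligned_iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (cx, cy, cz, w, h, l).

    Hypothetical helper for illustration: w, h, l are the extents along
    x, y, z, and rotation is ignored.
    """
    inter = 1.0
    for axis in range(3):
        half_a = box_a[axis + 3] / 2.0
        half_b = box_b[axis + 3] / 2.0
        lo = max(box_a[axis] - half_a, box_b[axis] - half_b)
        hi = min(box_a[axis] + half_a, box_b[axis] + half_b)
        inter *= max(0.0, hi - lo)  # zero once the boxes stop overlapping
    vol_a = box_a[3] * box_a[4] * box_a[5]
    vol_b = box_b[3] * box_b[4] * box_b[5]
    return inter / (vol_a + vol_b - inter)


# Assumed ground truth: a car-sized box centred 30 m in front of the camera.
gt = (0.0, 0.0, 30.0, 1.8, 1.5, 4.0)

# Predictions that are perfect except for an error in depth (metres).
ious = {}
for depth_error in (0.0, 1.0, 2.0, 4.0, 8.0):
    pred = (gt[0], gt[1], gt[2] + depth_error, gt[3], gt[4], gt[5])
    ious[depth_error] = axis_aligned_iou_3d(gt, pred)
    print(f"depth error {depth_error:4.1f} m -> IoU {ious[depth_error]:.2f}")
```

Under these assumptions, a 1 m depth error at 30 m already lowers the IoU to 0.6, and errors of 4 m and 8 m both give an IoU of 0. With a strict IoU threshold (0.7 is assumed here as an example), all four imperfect predictions count equally as misses, so halving the depth error from 8 m to 4 m would not change the score.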