Object Viewpoint Classification Based 3D Bounding Box Estimation for Autonomous Vehicles
This work addresses the perception challenge for autonomous vehicles by enhancing monocular image-based 3D detection, though it is incremental as it builds on existing two-stage frameworks.
The paper tackles 3D object detection for autonomous vehicles by introducing a viewpoint classification method to determine geometric constraints between 2D and 3D bounding boxes, resulting in improved computational efficiency and precision, particularly in orientation angle estimation, as validated on the KITTI dataset.
3D object detection is one of the most important tasks for the perception systems of autonomous vehicles. Following the significant success of 2D object detection, several monocular image-based 3D object detection algorithms have been proposed that build on advanced 2D object detectors and the geometric constraints between the 2D and 3D bounding boxes. In this paper, we propose a novel method for determining the configuration of the 2D-3D geometric constraints, based on the well-known two-stage 2D-3D object detection framework. First, we discretize the viewpoints from which the camera captures the object into 16 categories according to the observation relationship between the camera and the object. Second, we design a viewpoint classifier by integrating a new sub-branch into the existing multi-branch CNN. The configuration of the geometric constraints between the 2D and 3D bounding boxes can then be determined from the output of this classifier. Extensive experiments on the KITTI dataset show that our method not only improves computational efficiency but also increases the overall precision of the model, especially for orientation angle estimation.
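To make the viewpoint discretization step concrete, the following is a minimal sketch of how an observation angle could be mapped to one of 16 viewpoint categories. The abstract does not specify how the 16 categories are delimited, so the equal-width angular bins, the function names `observation_angle` and `viewpoint_class`, and the angle convention (global yaw minus the angle of the camera-to-object ray) are all assumptions made for illustration, not the authors' definition.

```python
import math

NUM_VIEWPOINTS = 16  # number of viewpoint categories stated in the abstract


def observation_angle(rotation_y: float, x: float, z: float) -> float:
    """Assumed convention: global yaw minus the angle of the ray from the
    camera to the object centre (x to the right, z forward)."""
    return rotation_y - math.atan2(x, z)


def viewpoint_class(alpha: float, num_bins: int = NUM_VIEWPOINTS) -> int:
    """Map an angle in radians to a bin index in [0, num_bins).

    Equal-width bins are an assumption; the paper may partition
    viewpoints differently.
    """
    two_pi = 2.0 * math.pi
    wrapped = alpha % two_pi           # wrap into [0, 2*pi]
    bin_width = two_pi / num_bins
    # The trailing modulo guards against rounding that lands exactly on 2*pi.
    return int(wrapped // bin_width) % num_bins
```

In a two-stage pipeline of the kind described, a classifier sub-branch would predict this category directly from the image crop, and the predicted index would then select which 2D-3D constraint configuration to solve, rather than enumerating all candidate configurations.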