BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View
This work improves 3D object detection for autonomous vehicles by addressing the information loss inherent in BEV projections, though it is an incremental advance over its predecessor, BirdNet.
The paper tackles 3D object detection from LiDAR Bird's Eye View images by proposing an end-to-end framework that infers oriented 3D boxes without post-processing, achieving state-of-the-art results on the KITTI benchmark across all categories.
On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices. Although image features are typically preferred for detection, numerous approaches take only spatial data as input. Exploiting this information at inference time usually involves compact representations such as the Bird's Eye View (BEV) projection, which entails a loss of information and thus hinders the joint inference of all the parameters of the objects' 3D boxes. In this paper, we present a fully end-to-end 3D object detection framework that can infer oriented 3D boxes solely from BEV images by using a two-stage object detector and ad-hoc regression branches, eliminating the need for a post-processing stage. The method outperforms its predecessor (BirdNet) by a large margin and obtains state-of-the-art results on the KITTI 3D Object Detection Benchmark for all evaluated categories.
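To make the information loss of a BEV projection concrete, the following is a minimal sketch of how a LiDAR point cloud might be rasterized into a top-down grid. It is not the encoding used by BirdNet+: the function name `lidar_to_bev`, the grid ranges, the cell size, and the two channels (maximum height and point count) are all assumptions chosen for illustration.

```python
import math

def lidar_to_bev(points, x_range=(0.0, 4.0), y_range=(-2.0, 2.0), cell=1.0):
    """Project (x, y, z) LiDAR points onto a 2D grid seen from above.

    Hypothetical encoding: returns two grids indexed as grid[row][col],
    the maximum height per cell (None where the cell is empty) and the
    number of points per cell.
    """
    rows = int(math.ceil((x_range[1] - x_range[0]) / cell))
    cols = int(math.ceil((y_range[1] - y_range[0]) / cell))
    max_height = [[None] * cols for _ in range(rows)]
    density = [[0] * cols for _ in range(rows)]
    for x, y, z in points:
        # Discard points that fall outside the area covered by the grid.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        r = int((x - x_range[0]) / cell)
        c = int((y - y_range[0]) / cell)
        density[r][c] += 1
        # Only the highest point in each cell survives the projection.
        if max_height[r][c] is None or z > max_height[r][c]:
            max_height[r][c] = z
    return max_height, density

# Two points share a cell, one lies in another cell, one is out of range.
points = [(0.5, -1.5, 0.2), (0.6, -1.4, 1.5), (3.2, 1.1, 0.4), (9.0, 0.0, 1.0)]
max_height, density = lidar_to_bev(points)
```

In this toy input, the first two points land in the same cell, so the grid keeps a single height for them and the lower point's height is lost. This collapse of the vertical axis is the kind of loss that makes it hard to recover a box's height and vertical position from the BEV image alone. A practical implementation would normally vectorize the loop with an array library.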