PIXOR: Real-time 3D Object Detection from Point Clouds
PIXOR addresses the need for fast, accurate object detection in autonomous vehicles, where detection speed is safety-critical, and reports higher Average Precision than prior state-of-the-art methods.
The paper tackles real-time 3D object detection from point clouds for autonomous driving. It proposes PIXOR, a proposal-free, single-stage detector operating on a Bird's Eye View (BEV) representation, which achieves higher Average Precision than state-of-the-art methods on the KITTI BEV benchmark and a large-scale 3D vehicle detection benchmark while running at over 28 FPS.
We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. Computation speed is critical, as detection is a necessary component for safety. Existing approaches are, however, computationally expensive due to the high dimensionality of point clouds. We utilize the 3D data more efficiently by representing the scene from the Bird's Eye View (BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. The input representation, network architecture, and model optimization are specifically designed to balance high accuracy and real-time efficiency. We validate PIXOR on two datasets: the KITTI BEV object detection benchmark and a large-scale 3D vehicle detection benchmark. On both datasets we show that the proposed detector notably surpasses other state-of-the-art methods in terms of Average Precision (AP), while still running at >28 FPS.
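To make the BEV idea concrete, the following is a minimal sketch of one way a point cloud can be rasterized into a top-down grid whose height slices act as image channels, so that a 2D network can process it. It is an illustration only, not the paper's implementation: the function name `bev_occupancy`, the region of interest, and the 1 m cell size are all hypothetical values chosen to keep the example small.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres; the frame is an assumption


def bev_occupancy(
    points: Iterable[Point],
    x_range: Tuple[float, float] = (0.0, 8.0),   # assumed forward extent
    y_range: Tuple[float, float] = (-4.0, 4.0),  # assumed lateral extent
    z_range: Tuple[float, float] = (-2.0, 1.0),  # assumed height extent
    cell: float = 1.0,                           # assumed cell size in metres
) -> List[List[List[int]]]:
    """Return a binary grid where grid[ix][iy] holds one value per height slice."""
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    nz = int(round((z_range[1] - z_range[0]) / cell))
    grid = [[[0] * nz for _ in range(ny)] for _ in range(nx)]

    for x, y, z in points:
        # Map metric coordinates to integer cell indices.
        ix = math.floor((x - x_range[0]) / cell)
        iy = math.floor((y - y_range[0]) / cell)
        iz = math.floor((z - z_range[0]) / cell)
        # Points outside the region of interest are dropped.
        if 0 <= ix < nx and 0 <= iy < ny and 0 <= iz < nz:
            grid[ix][iy][iz] = 1  # mark the cell as occupied
    return grid


cloud = [(0.5, 0.5, -1.5), (7.9, -3.9, 0.9), (100.0, 0.0, 0.0)]
grid = bev_occupancy(cloud)
occupied = sum(v for plane in grid for column in plane for v in column)
```

With the three sample points, two fall inside the assumed region and one is discarded. The result is a fixed-size, image-like tensor (here 8 x 8 cells with 3 height channels), which is what lets a single-stage 2D convolutional detector produce pixel-wise predictions over the scene.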