CV, Jan 17, 2021

PLUMENet: Efficient 3D Object Detection from Stereo Images

Yan Wang, Bin Yang, Rui Hu, Ming Liang, Raquel Urtasun

arXiv:2101.06594v312.644 citations

Originality: Incremental advance

AI Analysis

This work addresses the cost and efficiency of 3D perception for robotic applications such as self-driving vehicles by improving stereo-based 3D object detection, though it is an incremental advance over existing methods.

The paper tackles the suboptimal two-step approach for 3D object detection from stereo images by proposing PLUMENet, which unifies depth estimation and object detection in the same metric space, achieving state-of-the-art performance with faster inference times on the KITTI benchmark.

3D object detection is a key component of many robotic applications such as self-driving vehicles. While many approaches rely on expensive 3D sensors such as LiDAR to produce accurate 3D estimates, methods that exploit stereo cameras have recently shown promising results at a lower cost. Existing approaches tackle this problem in two steps: first depth estimation from stereo images is performed to produce a pseudo LiDAR point cloud, which is then used as input to a 3D object detector. However, this approach is suboptimal due to the representation mismatch, as the two tasks are optimized in two different metric spaces. In this paper we propose a model that unifies these two tasks and performs them in the same metric space. Specifically, we directly construct a pseudo LiDAR feature volume (PLUME) in 3D space, which is then used to solve both depth estimation and object detection tasks. Our approach achieves state-of-the-art performance with much faster inference times when compared to existing methods on the challenging KITTI benchmark.
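As a rough illustration of the two-step baseline the abstract contrasts against (not PLUMENet itself), the following minimal sketch converts stereo disparity to depth and back-projects a depth map into a pseudo LiDAR point cloud. It assumes rectified stereo images and a pinhole camera model; the function names and the parameters `focal_px`, `baseline_m`, `cx`, and `cy` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def disparity_to_depth(disparity: float, focal_px: float, baseline_m: float) -> float:
    """Stereo triangulation for rectified images: depth = focal * baseline / disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


def depth_map_to_points(
    depth: Sequence[Sequence[float]],
    focal_px: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project each pixel (u, v) with depth z into camera-frame 3D coordinates.

    Assumes a pinhole camera with equal focal length on both axes and
    principal point (cx, cy). Pixels with non-positive depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth estimate at this pixel
            x = (u - cx) * z / focal_px
            y = (v - cy) * z / focal_px
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Hypothetical camera values chosen only for illustration.
    z = disparity_to_depth(disparity=10.0, focal_px=700.0, baseline_m=0.5)
    cloud = depth_map_to_points([[0.0, z]], focal_px=700.0, cx=0.0, cy=0.0)
    print(z, cloud)
```

In this baseline the depth network is trained on per-pixel error in image space, while the detector consumes the resulting points in 3D space. That is the representation mismatch the abstract describes and that PLUMENet avoids by building its feature volume directly in 3D.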
