CV · IV · Apr 7, 2020

End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection

arXiv:2004.03080v2 · 194 citations · Has Code
AI Analysis

This work addresses the need for accurate and affordable 3D object detection in autonomous driving. It is an incremental improvement that enables end-to-end training of pseudo-LiDAR pipelines.

The paper tackles 3D object detection for autonomous driving by introducing an end-to-end trainable pseudo-LiDAR framework. Combined with PointRCNN, it improves consistently over the original pseudo-LiDAR pipeline and was the highest-ranked entry on the KITTI image-based 3D object detection leaderboard at submission time.

Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks -- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission. Our code will be made available at https://github.com/mileyan/pseudo-LiDAR_e2e.
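
The conversion of a 2D depth map into a 3D point cloud mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes a simple pinhole camera model, and the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration. The paper's CoR modules themselves are not reproduced here.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); these parameter names are illustrative assumptions.
    Points are returned in camera coordinates (x right, y down, z forward).
    """
    h, w = depth.shape
    # u indexes image columns, v indexes image rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # One 3D point per pixel, flattened in row-major order.
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


# Toy example: a flat surface 10 m in front of the camera.
depth = np.full((2, 3), 10.0)
points = depth_to_point_cloud(depth, fx=100.0, fy=100.0, cx=1.0, cy=0.5)
```

Every step above is plain arithmetic on the depth values, so the same computation written in an automatic-differentiation framework would let gradients reach the depth map. That is the general property an end-to-end pipeline relies on, though the specific CoR designs are described in the paper rather than here.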
