CV RO Apr 8, 2024

Better Monocular 3D Detectors with LiDAR from the Past

Yurong You, Cheng Perng Phoo, Carlos Andres Diaz-Ruiz, Katie Z Luo, Wei-Lun Chao, Mark Campbell, Bharath Hariharan, Kilian Q Weinberger

arXiv:2404.05139v23.72 citations | h-index: 79 | Has Code | ICRA

Originality: Incremental advance

AI Analysis

This work addresses the performance gap between camera-based and LiDAR-based 3D detectors for affordable autonomous vehicles, offering an incremental improvement by utilizing existing LiDAR data.

The paper improves monocular 3D object detection for autonomous driving by leveraging unlabeled historical LiDAR data from past traversals, achieving a gain of up to 9 AP with minimal additional latency and storage cost.

Accurate 3D object detection is crucial to autonomous driving. Though LiDAR-based detectors have achieved impressive performance, the high cost of LiDAR sensors precludes their widespread adoption in affordable vehicles. Camera-based detectors are cheaper alternatives but often suffer from inferior performance compared to their LiDAR-based counterparts due to inherent depth ambiguities in images. In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data. Specifically, at inference time, we assume that the camera-based detectors have access to multiple unlabeled LiDAR scans from past traversals at locations of interest (potentially from other high-end vehicles equipped with LiDAR sensors). Under this setup, we propose a novel, simple, and end-to-end trainable framework, termed AsyncDepth, to effectively extract relevant features from asynchronous LiDAR traversals of the same location for monocular 3D detectors. We show consistent and significant performance gains (up to 9 AP) across multiple state-of-the-art models and datasets with a negligible additional latency of 9.66 ms and a small storage cost.
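The abstract does not spell out how past scans are turned into features. As a rough illustration only, the sketch below assumes that points from a past traversal are already registered in a shared world frame and that the current camera pose and pinhole intrinsics are known; it projects those points into the current image to form a sparse depth map. The function names, the pose convention (p_cam = R * p_world + t), and all parameters are hypothetical and are not taken from the paper or its code.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def world_to_camera(points: Sequence[Point],
                    R: Sequence[Sequence[float]],
                    t: Sequence[float]) -> List[Point]:
    """Apply an assumed rigid transform p_cam = R * p_world + t to each point."""
    out = []
    for p in points:
        out.append(tuple(
            R[i][0] * p[0] + R[i][1] * p[1] + R[i][2] * p[2] + t[i]
            for i in range(3)
        ))
    return out


def render_sparse_depth(points_cam: Sequence[Point],
                        fx: float, fy: float, cx: float, cy: float,
                        width: int, height: int) -> List[List[Optional[float]]]:
    """Project camera-frame points (x right, y down, z forward) with a pinhole
    model. Each pixel keeps the nearest depth; pixels with no point stay None."""
    depth: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for x, y, z in points_cam:
        if z <= 0:
            continue  # point is behind the camera
        u = math.floor(fx * x / z + cx)
        v = math.floor(fy * y / z + cy)
        if 0 <= u < width and 0 <= v < height:
            current = depth[v][u]
            if current is None or z < current:
                depth[v][u] = z
    return depth


# Hypothetical usage: three points from a past traversal, identity rotation,
# and a camera placed 5 units behind the world origin along the z axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
past_points = [(0.0, 0.0, 5.0), (0.0, 0.0, 0.0), (1.0, 0.0, 5.0)]
cam_points = world_to_camera(past_points, identity, (0.0, 0.0, 5.0))
depth_map = render_sparse_depth(cam_points, fx=10.0, fy=10.0,
                                cx=2.0, cy=2.0, width=5, height=5)
```

Because the scans are asynchronous, a depth map like this would capture mostly static scene geometry rather than the objects currently in view; how a detector should consume it is what the paper's learned framework addresses, and that part is not sketched here.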
