CV · Nov 20, 2024

VADet: Multi-frame LiDAR 3D Object Detection using Variable Aggregation

arXiv:2411.13186v1 · 2 citations · h-index: 22 · WACV
Originality: Highly original
AI Analysis

This work addresses a bottleneck in LiDAR-based autonomous driving systems, the limited benefit of fixed multi-frame aggregation, and improves detection accuracy. It is incremental in that it builds on existing single-stage detectors.

The paper tackles diminishing returns and performance degradation in multi-frame LiDAR 3D object detection by proposing VADet, an adaptive method that chooses the number of aggregated frames per object based on observed properties such as speed and point density. Applied to three single-stage detectors, it achieves state-of-the-art performance on the Waymo dataset.

Input aggregation is a simple technique used by state-of-the-art LiDAR 3D object detectors to improve detection. However, increasing aggregation is known to have diminishing returns and even performance degradation, due to objects responding differently to the number of aggregated frames. To address this limitation, we propose an efficient adaptive method, which we call Variable Aggregation Detection (VADet). Instead of aggregating the entire scene using a fixed number of frames, VADet performs aggregation per object, with the number of frames determined by an object's observed properties, such as speed and point density. VADet thus reduces the inherent trade-offs of fixed aggregation and is not architecture specific. To demonstrate its benefits, we apply VADet to three popular single-stage detectors and achieve state-of-the-art performance on the Waymo dataset.
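
The per-object aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the actual selection rule, so the `TrackedObject` type, the `choose_num_frames` function, the thresholds, and the direction of the rule (fewer frames for fast or already dense objects, more for slow and sparse ones) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class TrackedObject:
    """Hypothetical container for one object's history (not from the paper)."""
    speed_mps: float                      # estimated speed in metres per second
    points_per_frame: List[List[Point]]   # newest frame first, ego-motion compensated


def choose_num_frames(speed_mps: float, num_points: int, max_frames: int = 16) -> int:
    """Pick a frame budget from observed properties.

    Assumed rule: fast objects smear when many frames are stacked, so they
    get few frames; objects that are already dense gain little from more.
    All thresholds are illustrative placeholders.
    """
    if speed_mps > 5.0:
        budget = 2
    elif speed_mps > 1.0:
        budget = 4
    else:
        budget = max_frames
    if num_points > 500:
        budget = min(budget, 4)
    return max(1, min(budget, max_frames))


def aggregate_object(obj: TrackedObject, max_frames: int = 16) -> List[Point]:
    """Merge the newest n frames of one object, with n chosen per object."""
    newest_count = len(obj.points_per_frame[0])
    n = choose_num_frames(obj.speed_mps, newest_count, max_frames)
    n = min(n, len(obj.points_per_frame))  # cannot use more history than exists
    merged: List[Point] = []
    for frame in obj.points_per_frame[:n]:
        merged.extend(frame)
    return merged
```

Under these assumed thresholds, a parked car with sparse returns would use every available frame, while a vehicle moving at 10 m/s would use only the two most recent. A fixed-aggregation baseline corresponds to `choose_num_frames` returning the same constant for every object.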

