CV · LG · RO · Sep 28, 2023

LEF: Late-to-Early Temporal Fusion for LiDAR 3D Object Detection

arXiv:2309.16870v1 · 7 citations · h-index: 30
Originality: Incremental advance
AI Analysis

This work targets accurate object detection for autonomous driving, especially for large objects. It is an incremental advance whose specific contributions include reducing the number of history features that must be fused and a training technique that handles variable frame lengths.

The paper tackles 3D object detection from temporal LiDAR point clouds with a late-to-early recurrent feature fusion scheme. On the Waymo Open Dataset, the scheme improves detection over the baseline model, particularly for large objects.

We propose a late-to-early recurrent feature fusion scheme for 3D object detection using temporal LiDAR point clouds. Our main motivation is fusing object-aware latent embeddings into the early stages of a 3D object detector. This feature fusion strategy enables the model to better capture the shapes and poses for challenging objects, compared with learning from raw points directly. Our method conducts late-to-early feature fusion in a recurrent manner. This is achieved by enforcing window-based attention blocks upon temporally calibrated and aligned sparse pillar tokens. Leveraging bird's eye view foreground pillar segmentation, we reduce the number of sparse history features that our model needs to fuse into its current frame by 10$\times$. We also propose a stochastic-length FrameDrop training technique, which generalizes the model to variable frame lengths at inference for improved performance without retraining. We evaluate our method on the widely adopted Waymo Open Dataset and demonstrate improvement on 3D object detection against the baseline model, especially for the challenging category of large objects.
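The abstract names a stochastic-length FrameDrop training technique but gives no details of how frames are dropped. The following is a minimal sketch of one plausible reading, not the authors' implementation: for each training sample, a random number of history frames is kept (in temporal order) and the current frame is always retained, so the model sees sequences of varying length. The function name `frame_drop`, the `max_history` parameter, and the uniform sampling are all assumptions made for illustration.

```python
import random


def frame_drop(frames, max_history, rng=random):
    """Hypothetical FrameDrop sketch (sampling scheme is an assumption).

    frames: list ordered oldest -> newest; the last element is the current frame.
    max_history: upper bound on how many history frames may be kept.
    Returns a randomly sized, temporally ordered subset of the history,
    followed by the current frame.
    """
    if not frames:
        raise ValueError("need at least the current frame")
    *history, current = frames
    # Only the most recent `max_history` frames are candidates.
    history = history[-max_history:] if max_history > 0 else []
    # Stochastic length: keep between 0 and len(history) history frames.
    keep = rng.randint(0, len(history))
    # Choose which frames survive, then restore temporal order.
    kept_idx = sorted(rng.sample(range(len(history)), keep))
    return [history[i] for i in kept_idx] + [current]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatability
    sequence = [f"frame_{t}" for t in range(8)]
    for _ in range(3):
        print(frame_drop(sequence, max_history=4, rng=rng))
```

In a training loop, such a function would be applied per sample before the frames are fed to the recurrent detector; at inference, the full available history would be passed unchanged. Whether the paper samples lengths uniformly or uses another distribution is not stated in the abstract.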

