CV · Mar 10, 2022

PETR: Position Embedding Transformation for Multi-View 3D Object Detection

arXiv:2203.05625v3 · 822 citations · h-index: 86 · Has Code
Originality: Highly original
AI Analysis

PETR addresses multi-view 3D object detection for autonomous driving systems and is intended as a simple yet strong baseline for future research.

The paper tackles multi-view 3D object detection by introducing PETR, which encodes 3D coordinate position information into image features, achieving state-of-the-art performance with 50.4% NDS and 44.1% mAP on the nuScenes dataset.

In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can then perceive these 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks 1st on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR.
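The idea of encoding 3D coordinates into image features can be sketched roughly as follows. This is a minimal NumPy illustration, not the code from the linked repository: the function names, the depth bins, the region bounds, and the two-layer MLP weights `w1` and `w2` are all hypothetical, and a single camera with a known 4x4 image-to-world matrix is assumed. Each feature-map cell is lifted to several 3D points along its camera ray, the points are mapped to world space and normalized, and a small MLP turns them into an embedding that is added to the image features.

```python
import numpy as np

def frustum_points(feat_h, feat_w, img_h, img_w, depths):
    """Homogeneous frustum points (u*d, v*d, d, 1), shape (D, H, W, 4)."""
    # Pixel centres of each feature-map cell, expressed in image coordinates.
    us = (np.arange(feat_w) + 0.5) * img_w / feat_w
    vs = (np.arange(feat_h) + 0.5) * img_h / feat_h
    d, v, u = np.meshgrid(depths, vs, us, indexing="ij")  # each (D, H, W)
    return np.stack([u * d, v * d, d, np.ones_like(d)], axis=-1)

def position_aware_features(feats, img2world, img_size, depths, region, w1, w2):
    """Add a 3D position embedding to 2D image features.

    feats:     (C, H, W) image features from one camera
    img2world: (4, 4) inverse of the camera projection matrix (assumed known)
    img_size:  (img_h, img_w) of the input image
    depths:    (D,) sampled depth values along each camera ray (hypothetical)
    region:    (lo, hi) arrays of shape (3,) bounding the 3D region of interest
    w1, w2:    weights of a bias-free two-layer MLP, (3*D, hidden) and (hidden, C)
    """
    c, h, w = feats.shape
    pts = frustum_points(h, w, img_size[0], img_size[1], depths)
    # Map every frustum point into 3D world space.
    world = pts @ img2world.T                      # (D, H, W, 4)
    lo, hi = region
    xyz = (world[..., :3] - lo) / (hi - lo)        # normalize to roughly [0, 1]
    # Concatenate the D points of each cell into one coordinate vector.
    coords = xyz.transpose(1, 2, 0, 3).reshape(h, w, -1)  # (H, W, 3*D)
    hidden = np.maximum(coords @ w1, 0.0)          # ReLU
    pos_embed = hidden @ w2                        # (H, W, C)
    return feats + pos_embed.transpose(2, 0, 1)    # (C, H, W)
```

In a multi-view setting the same step would be repeated per camera with that camera's own matrix, so that all views share one 3D coordinate frame before the object queries attend to them.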

Code Implementations: 1 repo
