CV · Dec 21, 2020

3D Object Detection with Pointformer

arXiv:2012.11409v3 · 455 citations
AI Analysis

This work provides an incremental improvement in 3D object detection performance for researchers and practitioners working with point cloud data.

This paper introduces Pointformer, a Transformer-based backbone for 3D object detection from point clouds. It significantly improves state-of-the-art object detection models on both indoor and outdoor datasets by learning context-dependent region features and context-aware scene representations.

Feature learning for 3D object detection from point clouds is very challenging due to the irregularity of 3D point cloud data. In this paper, we propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively. Specifically, a Local Transformer module is employed to model interactions among points in a local region, which learns context-dependent region features at an object level. A Global Transformer is designed to learn context-aware representations at the scene level. To further capture the dependencies among multi-scale representations, we propose Local-Global Transformer to integrate local features with global features from higher resolution. In addition, we introduce an efficient coordinate refinement module to shift down-sampled points closer to object centroids, which improves object proposal generation. We use Pointformer as the backbone for state-of-the-art object detection models and demonstrate significant improvements over original models on both indoor and outdoor datasets.
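As a rough illustration of the Local Transformer and coordinate refinement ideas described above, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names `local_self_attention` and `refine_centroid` are hypothetical, the query/key/value projections are replaced by the identity for brevity, and it assumes that refinement takes an attention-weighted average of the region's point coordinates, which is one plausible reading of "shift down-sampled points closer to object centroids".

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_self_attention(features):
    """Scaled dot-product self-attention among the points of one local region.

    features: list of per-point feature vectors of equal length.
    Returns (updated_features, attention), where attention[i][j] is the
    weight that point i assigns to point j.
    Assumption: identity Q/K/V projections (a real layer would learn them).
    """
    d = len(features[0])
    attention, updated = [], []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in features]
        w = softmax(scores)
        attention.append(w)
        updated.append(
            [sum(w[j] * features[j][c] for j in range(len(features))) for c in range(d)]
        )
    return updated, attention


def refine_centroid(coords, weights):
    """Attention-weighted mean of the region's 3D coordinates (assumed scheme)."""
    return [sum(w * p[c] for w, p in zip(weights, coords)) for c in range(3)]


# Toy local region: index 0 is the down-sampled point, the rest are its neighbours.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

updated, attention = local_self_attention(features)
# Shift the sampled point using its own attention row over the region.
refined = refine_centroid(coords, attention[0])
```

Because the attention weights are non-negative and sum to one, the refined point always stays inside the convex hull of the region's points, so it moves toward wherever the attention mass concentrates instead of staying at the original sampled location.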

Code Implementations: 1 repo