CV · Jul 30, 2022

Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding

arXiv:2208.00281v2 · 44 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficient and effective 4D point cloud video analysis for applications like robotics or autonomous driving, though it appears incremental as it builds on existing transformer and hierarchical approaches.

The paper tackles long-term 4D point cloud video understanding by proposing a hierarchical backbone that uses primitive planes as a mid-level representation to capture spatial-temporal context. It reports outperforming previous state-of-the-art methods on different tasks.

This paper proposes a 4D backbone for long-term point cloud video understanding. A typical way to capture spatial-temporal context is to use 4D convolution or a transformer without hierarchy. However, those methods are neither effective nor efficient enough due to camera motion, scene changes, sampling patterns, and the complexity of 4D data. To address those issues, we leverage the primitive plane as a mid-level representation to capture the long-term spatial-temporal context in 4D point cloud videos and propose a novel hierarchical backbone named Point Primitive Transformer (PPTr), which is mainly composed of intra-primitive point transformers and primitive transformers. Extensive experiments show that PPTr outperforms the previous state of the art on different tasks.
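The two-level structure named in the abstract (attention among points inside each primitive, then attention among primitives) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the identity query/key/value projections, the max-pooling step, and the assumption that each point already carries a primitive label (`primitive_ids`) are all illustrative choices made here, and plane fitting itself is not shown.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Single-head self-attention with identity Q/K/V projections
    # (a simplification: a real model would learn these projections).
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens

def two_level_attention(features, primitive_ids):
    # features: (N, C) point features, assumed pooled from several frames.
    # primitive_ids: (N,) integer label of the primitive each point belongs to
    # (hypothetical input; how primitives are extracted is out of scope here).
    refined = np.empty_like(features)
    primitive_tokens = []
    for pid in np.unique(primitive_ids):
        mask = primitive_ids == pid
        # Level 1: attention restricted to points of the same primitive.
        refined[mask] = self_attention(features[mask])
        # Summarize the primitive as one token (max-pooling is an assumption).
        primitive_tokens.append(refined[mask].max(axis=0))
    # Level 2: attention across primitive tokens for longer-range context.
    primitive_tokens = self_attention(np.stack(primitive_tokens))
    return refined, primitive_tokens

rng = np.random.default_rng(0)
features = rng.normal(size=(12, 8))          # 12 points, 8 channels
primitive_ids = np.repeat(np.arange(3), 4)   # 3 primitives, 4 points each
point_feats, prim_feats = two_level_attention(features, primitive_ids)
print(point_feats.shape, prim_feats.shape)
```

Restricting the first attention pass to each primitive keeps the score matrices small, while the second pass only attends over one token per primitive; this is one plausible reading of why a hierarchy helps with efficiency on long sequences.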

Code Implementations: 1 repo
