CV · Dec 16, 2020

Point Transformer

arXiv:2012.09164v239.0281 citations · Has Code
Originality: Highly original
AI Analysis

This work provides a significant improvement in 3D point cloud processing, particularly for large-scale semantic scene segmentation, benefiting applications requiring accurate 3D scene understanding.

This paper applies self-attention networks to 3D point cloud processing, designing self-attention layers for point clouds and building networks from them for semantic scene segmentation, object part segmentation, and object classification. The Point Transformer achieves a mean Intersection over Union (mIoU) of 70.4% on Area 5 of the S3DIS dataset for semantic scene segmentation, surpassing the previous state of the art by 3.3 percentage points.
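For readers unfamiliar with the metric, the following is a minimal sketch of how mIoU can be computed from per-point predictions and labels. It is not taken from the paper's code: the function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union, averaged over classes that occur
    in the predictions or the ground truth (an illustrative convention)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both pred and target
    return float((intersection[present] / union[present]).mean())

# Toy example: 6 points, 3 classes (labels are made up for illustration).
target = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
miou = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {miou:.3f}")  # per-class IoU is 1/3, 2/3, 1/2, so mIoU = 0.500
```

A gain of 3.3 percentage points in this metric is an absolute difference between two mIoU values, not a relative improvement.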

Self-attention networks have revolutionized natural language processing and are making impressive strides in image analysis tasks such as image classification and object detection. Inspired by this success, we investigate the application of self-attention networks to 3D point cloud processing. We design self-attention layers for point clouds and use these to construct self-attention networks for tasks such as semantic scene segmentation, object part segmentation, and object classification. Our Point Transformer design improves upon prior work across domains and tasks. For example, on the challenging S3DIS dataset for large-scale semantic scene segmentation, the Point Transformer attains an mIoU of 70.4% on Area 5, outperforming the strongest prior model by 3.3 absolute percentage points and crossing the 70% mIoU threshold for the first time.
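To make the idea of a self-attention layer on a point cloud concrete, here is a generic sketch under stated assumptions. It uses standard scaled dot-product attention restricted to each point's k nearest neighbors; the abstract does not describe the paper's layer design, so this should not be read as the Point Transformer layer itself. The function name `knn_self_attention`, the projection matrices, and all sizes are hypothetical.

```python
import numpy as np

def knn_self_attention(points, feats, k, w_q, w_k, w_v):
    """Scaled dot-product self-attention over each point's k nearest neighbors.

    points: (N, 3) coordinates, feats: (N, C) features,
    w_q / w_k / w_v: (C, D) projection matrices (hypothetical parameters).
    """
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # Indices of the k nearest neighbors; each point is its own nearest neighbor.
    idx = np.argsort(dist2, axis=1)[:, :k]           # (N, k)

    queries = feats @ w_q                            # (N, D)
    keys = (feats @ w_k)[idx]                        # (N, k, D)
    values = (feats @ w_v)[idx]                      # (N, k, D)

    scores = np.einsum("nd,nkd->nk", queries, keys) / np.sqrt(queries.shape[-1])
    scores -= scores.max(axis=1, keepdims=True)      # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)    # (N, k), rows sum to 1

    # Each output feature is a weighted sum over the point's neighborhood.
    return np.einsum("nk,nkd->nd", weights, values)  # (N, D)

# Toy usage with random data (all sizes chosen for illustration only).
rng = np.random.default_rng(0)
points = rng.normal(size=(8, 3))
feats = rng.normal(size=(8, 5))
w_q, w_k, w_v = (rng.normal(size=(5, 4)) for _ in range(3))
out = knn_self_attention(points, feats, k=3, w_q=w_q, w_k=w_k, w_v=w_v)
print(out.shape)  # (8, 4)
```

Because the output for each point depends only on the set of its neighbors and not on their order, a layer of this kind suits unordered point sets, which is one reason attention is a natural candidate for point clouds.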

Code Implementations: 24 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
