CV | Mar 21, 2022

Masked Discrimination for Self-Supervised Learning on Point Clouds

arXiv:2203.11183v2 | 243 citations | h-index: 46 | Has Code
Originality: Highly original
AI Analysis

This work addresses efficient and effective self-supervised learning for point cloud understanding. The contribution is incremental in that it adapts masked autoencoding from images and language to point clouds, with modifications specific to that domain.

The paper applies mask-based pretraining to point clouds through MaskPoint, a discriminative pretraining framework whose proxy task is binary classification between masked object points and sampled noise points. It reports state-of-the-art results on tasks such as 3D shape classification, along with a 4.1x pretraining speedup on ScanNet over the prior state-of-the-art Transformer baseline.

Masked autoencoding has achieved great success for self-supervised learning in the image and language domains. However, mask-based pretraining has yet to show benefits for point cloud understanding, likely because standard backbones like PointNet cannot properly handle the training versus testing distribution mismatch introduced by masking during training. In this paper, we bridge this gap by proposing a discriminative mask pretraining Transformer framework, MaskPoint, for point clouds. Our key idea is to represent the point cloud as discrete occupancy values (1 if part of the point cloud; 0 if not), and perform simple binary classification between masked object points and sampled noise points as the proxy task. In this way, our approach is robust to the point sampling variance in point clouds, and facilitates learning rich representations. We evaluate our pretrained models across several downstream tasks, including 3D shape classification, segmentation, and real-world object detection, and demonstrate state-of-the-art results while achieving a significant pretraining speedup (e.g., 4.1x on ScanNet) compared to the prior state-of-the-art Transformer baseline. Code is available at https://github.com/haotian-liu/MaskPoint.
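The proxy task described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 0.9 mask ratio, the uniform random masking, and the bounding-box noise sampling are all assumptions made for illustration. The Transformer encoder and decoder are omitted, so the loss function is simply fed whatever logits a model would produce for the query points.

```python
import numpy as np

def build_discrimination_batch(points, mask_ratio=0.9, rng=None):
    """Split a point cloud into visible points and occupancy queries.

    points: (N, 3) array. Returns (visible, queries, labels), where queries
    holds the masked real points followed by an equal number of noise points.
    All names and the default mask_ratio are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = points.shape[0]
    n_masked = int(round(n * mask_ratio))
    perm = rng.permutation(n)
    masked = points[perm[:n_masked]]     # hidden from the encoder, label 1
    visible = points[perm[n_masked:]]    # what an encoder would be given
    # Noise queries: uniform samples inside the cloud's bounding box, label 0.
    lo, hi = points.min(axis=0), points.max(axis=0)
    noise = rng.uniform(lo, hi, size=(n_masked, 3))
    queries = np.concatenate([masked, noise], axis=0)
    labels = np.concatenate([np.ones(n_masked), np.zeros(n_masked)])
    return visible, queries, labels

def binary_cross_entropy(logits, labels):
    """Numerically stable mean binary cross-entropy on raw logits."""
    return float(np.mean(np.maximum(logits, 0) - logits * labels
                         + np.log1p(np.exp(-np.abs(logits)))))
```

In use, a model would encode `visible`, predict one logit per row of `queries`, and be trained with `binary_cross_entropy(logits, labels)`. Because the target is occupancy (1 or 0) rather than exact coordinates, the loss does not depend on which particular surface points happened to be sampled, which is the robustness property the abstract refers to.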

Code Implementations: 2 repos