CV | Jan 11, 2024

Surface Normal Estimation with Transformers

arXiv:2401.05745v1 | 1 citation | h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses accurate surface normal estimation for 3D vision tasks with a simplified approach that eliminates the need for hand-designed modules. The advance is incremental, since it builds on existing learning-based methods.

The authors tackle surface normal estimation from noisy, variable-density point clouds with a Transformer-based model. It achieves state-of-the-art performance on the synthetic PCPNet dataset and the real-world SceneNN dataset, with improved noise resilience and faster inference.

We propose the use of a Transformer to accurately predict normals from point clouds with noise and density variations. Previous learning-based methods utilize PointNet variants to explicitly extract multi-scale features at different input scales, then focus on a surface fitting method by which local point cloud neighborhoods are fitted to a geometric surface approximated by either a polynomial function or a multi-layer perceptron (MLP). However, fitting surfaces to fixed-order polynomial functions can suffer from overfitting or underfitting, and learning MLP-represented hyper-surfaces requires pre-generated per-point weights. To avoid these limitations, we first unify the design choices in previous works and then propose a simplified Transformer-based model to extract richer and more robust geometric features for the surface normal estimation task. Through extensive experiments, we demonstrate that our Transformer-based method achieves state-of-the-art performance on both the synthetic shape dataset PCPNet and the real-world indoor scene dataset SceneNN, exhibiting more noise-resilient behavior and significantly faster inference. Most importantly, we demonstrate that the sophisticated hand-designed modules in existing works are not necessary to excel at the task of surface normal estimation.
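The polynomial surface-fitting step that the abstract attributes to previous methods can be sketched roughly as follows. This is a minimal illustration, not the paper's Transformer model and not any specific prior method: it assumes the neighborhood is already centered on the query point and rotated so that z points roughly along the normal, it uses plain unweighted least squares (no learned per-point weights), and the names `monomials`, `solve_linear`, and `fit_normal` are hypothetical. The `order` argument is the fixed polynomial order that the abstract says can cause overfitting or underfitting.

```python
import math


def monomials(x, y, order):
    """Terms x^i * y^j with i + j <= order, grouped by total degree: 1, x, y, x^2, xy, y^2, ..."""
    return [x ** (d - j) * y ** j for d in range(order + 1) for j in range(d + 1)]


def solve_linear(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate neighborhood: normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_normal(points, order=2):
    """Fit z = f(x, y) to a local neighborhood and return the unit normal at the origin.

    Assumption: `points` is a list of (x, y, z) tuples in a local frame centered
    on the query point, with the surface representable as a height field over x, y.
    """
    if order < 1:
        raise ValueError("order must be at least 1 to recover a gradient")
    rows = [monomials(x, y, order) for x, y, _ in points]
    zs = [z for _, _, z in points]
    k = len(rows[0])
    # Normal equations (A^T A) c = A^T z of the least-squares problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(k)]
    coeffs = solve_linear(ata, atz)
    # At the origin, df/dx and df/dy are the coefficients of the x and y terms.
    fx, fy = coeffs[1], coeffs[2]
    norm = math.sqrt(fx * fx + fy * fy + 1.0)
    return (-fx / norm, -fy / norm, 1.0 / norm)
```

For example, calling `fit_normal` on points sampled from the plane z = 0.5x - 0.25y returns a unit vector parallel to (-0.5, 0.25, 1). Raising `order` lets the fit follow curved patches more closely but also makes it more sensitive to noise, which is the trade-off the abstract points to.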

