CV · Mar 1, 2023

P$^2$SDF for Neural Indoor Scene Reconstruction

arXiv:2303.00236v1 · 3 citations · h-index: 59
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in 3D reconstruction of indoor environments. It offers an incremental improvement over existing neural implicit methods by handling low-textured regions without additional ground-truth supervision or room-layout assumptions.

The paper tackles neural implicit surface reconstruction in indoor scenes, which often fails in low-textured regions such as floors and walls. It proposes a Pseudo Plane-regularized Signed Distance Field (P$^2$SDF) that combines unsupervised plane estimation with a keypoint-guided ray sampling strategy. The method achieves competitive reconstruction performance in Manhattan scenes and generalizes well to non-Manhattan scenes.
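As a rough illustration of the plane-estimation and regularization idea summarized above, the following NumPy sketch fits a plane to 3D points using per-point confidence weights, then penalizes predicted signed distances that disagree with the fitted plane. The function names `fit_plane_weighted` and `plane_sdf_loss`, the eigen-decomposition fit, and the absolute-value L1 penalty are assumptions made for illustration. The paper's actual estimation scheme and loss are not specified here and may differ.

```python
import numpy as np


def fit_plane_weighted(points, weights):
    """Weighted least-squares plane fit (illustrative, not the paper's scheme).

    Returns (normal, offset) with a unit normal n and scalar d such that
    n . x + d = 0 for points x on the plane. The sign of n is arbitrary.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    w = weights / weights.sum()
    centroid = (w[:, None] * points).sum(axis=0)
    centered = points - centroid
    # Weighted 3x3 covariance; the plane normal is the eigenvector that
    # belongs to the smallest eigenvalue (eigh returns ascending order).
    cov = (w[:, None] * centered).T @ centered
    _, eigvecs = np.linalg.eigh(cov)
    normal = eigvecs[:, 0]
    offset = -float(normal @ centroid)
    return normal, offset


def plane_sdf_loss(sdf_values, points, weights, normal, offset):
    """Weighted L1 gap between |predicted SDF| and point-to-plane distance.

    Absolute values are compared because the fitted normal has no fixed
    orientation; this is an assumption of the sketch.
    """
    sdf_values = np.asarray(sdf_values, dtype=float)
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    plane_dist = points @ normal + offset
    residual = np.abs(np.abs(sdf_values) - np.abs(plane_dist))
    return float((weights * residual).sum() / weights.sum())
```

In a training loop, `points` would be the 3D samples belonging to one pseudo-plane segment, `sdf_values` the network's predictions at those samples, and `weights` a per-point confidence. Lowering the weight of a point reduces its influence on both the fit and the loss.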

Given only a set of images, neural implicit surface representation has shown its capability in 3D surface reconstruction. However, because per-scene optimization relies on the volumetric rendering of color, previous neural implicit surface reconstruction methods usually fail in low-textured regions such as floors and walls, which are common in indoor scenes. Observing that these low-textured regions usually correspond to planes, we propose a novel Pseudo Plane-regularized Signed Distance Field (P$^2$SDF) for indoor scene reconstruction that introduces no additional ground-truth supervisory signals and makes no additional assumptions about the room layout. Specifically, we consider adjacent pixels with similar colors to be on the same pseudo plane. The plane parameters are estimated on the fly during training by an efficient and effective two-step scheme, and the signed distances of the points on the planes are then regularized by the estimated plane parameters. As the unsupervised plane segments are usually noisy and inaccurate, we propose to assign different weights to the sampled points on the plane, both in plane estimation and in the regularization loss. The weights are obtained by fusing the plane segments from different views. As the sampled rays in the planar regions are redundant, leading to inefficient training, we further propose a keypoint-guided ray sampling strategy that attends to the informative textured regions with large color variations, so that the implicit network achieves a better reconstruction than with the original uniform ray sampling strategy. Experiments show that our P$^2$SDF achieves competitive reconstruction performance in Manhattan scenes. Further, as we do not introduce any additional room layout assumption, our P$^2$SDF generalizes well to the reconstruction of non-Manhattan scenes.
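The keypoint-guided ray sampling idea can be sketched as follows. The abstract does not name a keypoint detector, so this hypothetical example substitutes local color-gradient magnitude as a stand-in for keypoint saliency and mixes it with uniform sampling. The function name `sample_rays_by_color_variation` and the `uniform_fraction` parameter are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def sample_rays_by_color_variation(image, num_rays, uniform_fraction=0.5, rng=None):
    """Sample pixel coordinates, favouring pixels with large color variation.

    image: (H, W, 3) float array with H, W >= 2.
    Returns an integer array of shape (num_rays, 2) holding (row, col) pairs.
    Gradient magnitude is a stand-in for keypoints (assumption of this sketch).
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=float)
    gray = image.mean(axis=2)

    # Local color variation: magnitude of the finite-difference gradient.
    grad_rows, grad_cols = np.gradient(gray)
    variation = np.hypot(grad_rows, grad_cols).ravel()

    n_pixels = variation.size
    uniform = np.full(n_pixels, 1.0 / n_pixels)
    total = variation.sum()
    # Fall back to uniform sampling on a completely flat image.
    textured = variation / total if total > 0 else uniform

    # Mix uniform and texture-driven distributions so planar regions
    # are still visited, only less often.
    probs = uniform_fraction * uniform + (1.0 - uniform_fraction) * textured
    probs = probs / probs.sum()  # guard against rounding drift

    flat_idx = rng.choice(n_pixels, size=num_rays, replace=True, p=probs)
    rows, cols = np.unravel_index(flat_idx, gray.shape)
    return np.stack([rows, cols], axis=1)
```

Setting `uniform_fraction=1.0` recovers plain uniform ray sampling, which gives a simple baseline for comparison. Smaller values concentrate rays on textured regions.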

