CV · Apr 21, 2023

VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos

arXiv:2304.10687v1 · 18 citations · h-index: 26 · Has Code
Originality: Incremental advance
AI Analysis

This is an incremental improvement for 3D reconstruction in computer vision, addressing visibility and detail preservation in volumetric methods.

The paper tackles the problem of online 3D scene reconstruction from monocular videos by improving feature fusion and sparsification, resulting in superior performance with more scene details on benchmarks.

We propose VisFusion, a visibility-aware online 3D scene reconstruction approach from posed monocular videos. In particular, we aim to reconstruct the scene from volumetric features. Unlike previous reconstruction methods which aggregate features for each voxel from input views without considering its visibility, we aim to improve the feature fusion by explicitly inferring its visibility from a similarity matrix, computed from its projected features in each image pair. Following previous works, our model is a coarse-to-fine pipeline including a volume sparsification process. Different from their works which sparsify voxels globally with a fixed occupancy threshold, we perform the sparsification on a local feature volume along each visual ray to preserve at least one voxel per ray for more fine details. The sparse local volume is then fused with a global one for online reconstruction. We further propose to predict TSDF in a coarse-to-fine manner by learning its residuals across scales leading to better TSDF predictions. Experimental results on benchmarks show that our method can achieve superior performance with more scene details. Code is available at: https://github.com/huiyu-gao/VisFusion
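The ray-wise sparsification described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the layout of the occupancy scores (one row per visual ray, one column per depth sample), the function names `sparsify_global` and `sparsify_along_rays`, and the rule "fixed threshold plus the top-scoring voxel(s) on each ray" are all assumptions chosen for illustration. The point is only to contrast a purely global threshold, which can discard every voxel on a ray, with a per-ray rule that always keeps at least one.

```python
import numpy as np


def sparsify_global(occ, threshold=0.5):
    """Baseline: keep a voxel only if its occupancy score exceeds a fixed threshold."""
    return np.asarray(occ, dtype=float) > threshold


def sparsify_along_rays(occ, top_k=1, threshold=0.5):
    """Hypothetical ray-wise rule: threshold, then force-keep the top_k voxels per ray.

    occ has shape (num_rays, num_depth_samples); each row is one visual ray.
    Returns a boolean mask of the same shape.
    """
    occ = np.asarray(occ, dtype=float)
    keep = occ > threshold
    # Indices of the top_k highest-scoring samples on every ray.
    top_idx = np.argsort(-occ, axis=1)[:, :top_k]
    rows = np.arange(occ.shape[0])[:, None]
    keep[rows, top_idx] = True
    return keep


# Toy scores: the first and third rays have no voxel above 0.5.
occ = np.array([
    [0.10, 0.30, 0.20],
    [0.60, 0.90, 0.10],
    [0.05, 0.02, 0.04],
])

print(sparsify_global(occ).sum(axis=1))      # [0 2 0]
print(sparsify_along_rays(occ).sum(axis=1))  # [1 2 1]
```

In this toy case the global threshold drops the first and third rays entirely, while the ray-wise rule retains their best candidate, which is the behaviour the abstract credits with preserving fine details.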

Code Implementations: 1 repo
