CV · Dec 12, 2021

MVLayoutNet: 3D layout reconstruction with multi-view panoramas

arXiv:2112.06133v1 · 10 citations
Originality: Incremental advance
AI Analysis

This work improves 3D scene reconstruction for applications such as robotics and AR/VR, though the advance is incremental because it builds on existing methods such as MVSNet.

The paper tackles 3D layout reconstruction from multi-view panoramas by combining learned monocular layout estimation with multi-view stereo, reducing depth RMSE by 21.7% and 20.6% on the 2D-3D-S and ZInD datasets, respectively.
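For readers unfamiliar with the metric, the following is a minimal sketch of how depth RMSE and a relative reduction between two methods are typically computed. The function names and the toy numbers are illustrative assumptions, not values or code from the paper.

```python
import math


def depth_rmse(pred, gt):
    """Root-mean-square error between predicted and ground-truth depths."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    squared = [(p - g) ** 2 for p, g in zip(pred, gt)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Toy depths in metres (made up for illustration only).
ground_truth = [2.0, 3.0, 4.0, 5.0]
baseline_pred = [2.5, 3.5, 3.5, 5.5]  # hypothetical baseline output
improved_pred = [2.4, 3.4, 3.6, 5.4]  # hypothetical improved output

baseline = depth_rmse(baseline_pred, ground_truth)
improved = depth_rmse(improved_pred, ground_truth)
print(f"relative RMSE reduction: {relative_reduction(baseline, improved):.1%}")
```

A reported reduction of 21.7% therefore means the new method's RMSE is about 0.783 times the baseline's RMSE on that dataset.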

We present MVLayoutNet, an end-to-end network for holistic 3D reconstruction from multi-view panoramas. Our core contribution is to seamlessly combine learned monocular layout estimation and multi-view stereo (MVS) for accurate layout reconstruction in both 3D and image space. We jointly train a layout module to produce an initial layout and a novel MVS module to obtain accurate layout geometry. Unlike standard MVSNet [33], our MVS module takes as input a newly proposed layout cost volume, which aggregates multi-view costs at the same depth layer into corresponding layout elements. We additionally provide an attention-based scheme that guides the MVS module to focus on structural regions. Such a design considers both local pixel-level costs and global holistic information for better reconstruction. Experiments show that our method outperforms state-of-the-art methods in terms of depth RMSE by 21.7% and 20.6% on the 2D-3D-S [1] and ZInD [5] datasets, respectively. Finally, our method leads to coherent layout geometry that enables the reconstruction of an entire scene.
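The layout cost volume idea can be illustrated with a small sketch. The abstract does not specify the aggregation operator or how depth is read out, so this example makes loud assumptions: costs are averaged per layout element within each depth layer, and depth is chosen by a simple lowest-cost rule. All names (`layout_cost_volume`, `best_depth_per_element`) are hypothetical, and the actual network operates on learned features rather than raw lists.

```python
def layout_cost_volume(pixel_cost, element_of, num_elements):
    """Aggregate a per-pixel cost volume into per-layout-element costs.

    pixel_cost:   D lists of P floats, the matching cost of each pixel at
                  each depth layer (assumed already fused across views).
    element_of:   P ints giving the layout element (e.g. a wall) of each pixel.
    Returns D lists of num_elements floats (mean cost; an assumption).
    """
    counts = [0] * num_elements
    for k in element_of:
        counts[k] += 1
    volume = []
    for layer in pixel_cost:
        sums = [0.0] * num_elements
        for cost, k in zip(layer, element_of):
            sums[k] += cost
        # Elements with no pixels get infinite cost so they are never selected.
        volume.append([s / n if n else float("inf") for s, n in zip(sums, counts)])
    return volume


def best_depth_per_element(volume, depths):
    """Pick, for each layout element, the depth layer with the lowest cost."""
    num_elements = len(volume[0])
    return [
        depths[min(range(len(depths)), key=lambda d: volume[d][k])]
        for k in range(num_elements)
    ]


# Toy example: 2 depth layers, 4 pixels, 2 layout elements.
costs = [[1.0, 3.0, 5.0, 7.0],
         [4.0, 2.0, 1.0, 1.0]]
labels = [0, 0, 1, 1]
volume = layout_cost_volume(costs, labels, num_elements=2)
print(volume)                                      # [[2.0, 6.0], [3.0, 1.0]]
print(best_depth_per_element(volume, [1.0, 2.0]))  # [1.0, 2.0]
```

Pooling costs per element means every pixel of a wall votes for a single depth hypothesis, which is one way to read the abstract's claim that the design combines local pixel-level costs with global holistic information.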

