CV | Dec 12, 2021

MVLayoutNet: 3D layout reconstruction with multi-view panoramas

Zhihua Hu, Bo Duan, Yanfeng Zhang, Mingwei Sun, Jingwei Huang

arXiv:2112.06133v13.710 citations

Originality: Incremental advance

AI Analysis

The method improves 3D scene reconstruction for applications such as robotics and AR/VR, but the advance is incremental because it builds on existing methods such as MVSNet.

The paper tackles 3D layout reconstruction from multi-view panoramas by combining learned monocular layout estimation with multi-view stereo, reducing depth RMSE by 21.7% on the 2D-3D-S dataset and by 20.6% on the ZInD dataset.
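For readers unfamiliar with the metric, the following minimal sketch shows how depth RMSE and a relative reduction against a baseline can be computed. The function names and the sample depth values are illustrative assumptions and do not come from the paper.

```python
import math


def depth_rmse(pred, gt):
    """Root-mean-square error between predicted and ground-truth depths."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    squared_errors = [(p - g) ** 2 for p, g in zip(pred, gt)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical depths in meters, for illustration only.
gt = [2.0, 3.0, 4.0]
baseline_pred = [2.5, 3.5, 4.5]
improved_pred = [2.4, 3.4, 4.4]

baseline_rmse = depth_rmse(baseline_pred, gt)
improved_rmse = depth_rmse(improved_pred, gt)
reduction = relative_reduction(baseline_rmse, improved_rmse)
```

A reported figure such as 21.7% corresponds to `relative_reduction` returning roughly 0.217 when the compared method's RMSE is used as the baseline.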

We present MVLayoutNet, an end-to-end network for holistic 3D reconstruction from multi-view panoramas. Our core contribution is to seamlessly combine learned monocular layout estimation and multi-view stereo (MVS) for accurate layout reconstruction in both 3D and image space. We jointly train a layout module to produce an initial layout and a novel MVS module to obtain accurate layout geometry. Unlike standard MVSNet [33], our MVS module takes a newly proposed layout cost volume, which aggregates multi-view costs at the same depth layer into corresponding layout elements. We additionally provide an attention-based scheme that guides the MVS module to focus on structural regions. Such a design considers both local pixel-level costs and global holistic information for better reconstruction. Experiments show that our method outperforms state-of-the-art methods in terms of depth RMSE by 21.7% and 20.6% on the 2D-3D-S [1] and ZInD [5] datasets, respectively. Finally, our method leads to coherent layout geometry that enables the reconstruction of an entire scene.
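The idea of aggregating costs "at the same depth layer into corresponding layout elements" can be illustrated with a small sketch. This is one plausible reading, not the paper's implementation: it assumes a precomputed per-pixel cost volume, a per-pixel map of layout element ids, and simple mean pooling. The names `layout_cost_volume` and `best_depth_index` are hypothetical, and the winner-take-all depth choice stands in for the learned MVS module.

```python
import math


def layout_cost_volume(cost_volume, labels, num_elements):
    """Mean-pool per-pixel costs into layout elements, one depth layer at a time.

    cost_volume: list of D layers, each an H x W nested list of matching costs.
    labels: H x W nested list of layout element ids in [0, num_elements).
    Returns a D x num_elements nested list; elements with no pixels get math.inf.
    """
    pooled = []
    for layer in cost_volume:
        sums = [0.0] * num_elements
        counts = [0] * num_elements
        for cost_row, label_row in zip(layer, labels):
            for cost, element in zip(cost_row, label_row):
                sums[element] += cost
                counts[element] += 1
        pooled.append(
            [s / n if n else math.inf for s, n in zip(sums, counts)]
        )
    return pooled


def best_depth_index(layout_costs, element):
    """Winner-take-all: index of the depth layer with the lowest pooled cost."""
    return min(range(len(layout_costs)), key=lambda d: layout_costs[d][element])


# Toy example: two depth layers, a 2 x 3 image, two layout elements.
labels = [[0, 0, 1],
          [0, 0, 1]]
cost_volume = [
    [[1.0, 1.0, 4.0], [1.0, 1.0, 4.0]],  # depth layer 0
    [[3.0, 3.0, 2.0], [3.0, 3.0, 2.0]],  # depth layer 1
]
layout_costs = layout_cost_volume(cost_volume, labels, num_elements=2)
```

In this toy case every pixel of an element votes for the same depth layer, so pooling yields one cost per element per layer instead of one cost per pixel, which is the sense in which the design combines local pixel-level costs with holistic layout information.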
