CV · Apr 1, 2021

LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering

arXiv:2104.00568v2 · 62 citations
AI Analysis

This paper addresses 3D room layout reconstruction from monocular 360 images, a computer vision problem, and makes incremental improvements in how geometric information is leveraged.

The paper tackles 360-degree room layout estimation by formulating it as depth prediction on the horizon line and proposes a differentiable depth rendering method to enable end-to-end training without ground truth depth, achieving state-of-the-art performance on benchmark datasets.

Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space. Towards reconstructing the room layout in 3D, we formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama. Specifically, we propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable, thus making our proposed model end-to-end trainable while leveraging the 3D geometric information, without the need of providing the ground truth depth. Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets. Moreover, our formulation enables a pre-training step on the depth dataset, which further improves the generalizability of our layout estimation model.
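As a rough illustration of the layout-to-depth idea in the abstract, the sketch below casts horizontal rays from a camera at the origin and intersects them with the walls of a floor plan, giving one depth value per panorama column on the horizon line. It is a plain NumPy geometric sketch, not the paper's Differentiable Depth Rendering procedure: the function name `horizon_depth_from_layout`, the corner ordering, the ray count, and the square test room are all assumptions made for illustration.

```python
import numpy as np


def cross2d(a, b):
    """2D cross product (z-component) with NumPy broadcasting."""
    return a[..., 0] * b[..., 1] - a[..., 1] * b[..., 0]


def horizon_depth_from_layout(corners, num_rays=256, eps=1e-9):
    """Distance from a camera at the origin to the nearest wall per horizontal ray.

    corners: (N, 2) floor-plan wall corners (x, z), ordered around the room
             (hypothetical input format; the camera is assumed to be inside).
    Returns: (num_rays,) array of depths on the horizon line.
    """
    corners = np.asarray(corners, dtype=float)
    starts = corners[None, :, :]                            # (1, N, 2) wall start points
    edges = (np.roll(corners, -1, axis=0) - corners)[None]  # (1, N, 2) wall vectors

    # One ray per panorama column, covering the full 360 degrees.
    angles = np.linspace(-np.pi, np.pi, num_rays, endpoint=False)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)[:, None, :]  # (R, 1, 2)

    # Solve t * d = p + s * e for ray direction d and wall (p, e).
    denom = cross2d(dirs, edges)                            # (R, N)
    with np.errstate(divide="ignore", invalid="ignore"):
        t = cross2d(starts, edges) / denom                  # distance along the ray
        s = cross2d(starts, dirs) / denom                   # position along the wall
        valid = (np.abs(denom) > eps) & (t > 0) & (s >= -eps) & (s <= 1 + eps)

    # Keep the closest valid hit; rays that hit nothing get infinite depth.
    return np.where(valid, t, np.inf).min(axis=1)


# Usage: a 2 x 2 square room with the camera at its center.
square = [(1, -1), (1, 1), (-1, 1), (-1, -1)]
depth = horizon_depth_from_layout(square, num_rays=8)
```

Every step here is a ratio of cross products followed by a minimum, so the same computation could in principle be written in an autodiff framework; how the paper actually makes the conversion differentiable is not detailed in this summary.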

Code Implementations: 1 repo
