General 3D Room Layout from a Single View by Render-and-Compare
This work addresses the challenge of accurate 3D layout estimation for applications such as robotics and AR/VR. It is incremental in that it builds on existing single-view methods and adds occlusion handling.
The paper tackles the reconstruction of 3D room layouts from a single view and removes the restriction of previous methods to cuboid shapes. It does so with a render-and-compare approach that handles occlusions, validated on a new dataset of 293 images with precise annotations.
We present a novel method to reconstruct the 3D layout of a room (walls, floors, ceilings) from a single perspective view in challenging conditions, by contrast with previous single-view methods restricted to cuboid-shaped layouts. This input view can consist of a color image only, but considering a depth map results in a more accurate reconstruction. Our approach is formalized as solving a constrained discrete optimization problem to find the set of 3D polygons that constitute the layout. In order to deal with occlusions between components of the layout, which is a problem ignored by previous works, we introduce an analysis-by-synthesis method to iteratively refine the 3D layout estimate. As no dataset was available to evaluate our method quantitatively, we created one together with several appropriate metrics. Our dataset consists of 293 images from ScanNet, which we annotated with precise 3D layouts. It offers three times more samples than the popular NYUv2 303 benchmark, and a much larger variety of layouts.
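The iterative refinement described above can be illustrated with a small toy sketch. It is not the paper's constrained discrete optimization over 3D polygons. It assumes a single 1D scanline, a fixed set of candidate planes given directly as per-pixel depth profiles, a layout rendered as the nearest selected plane at each pixel, and a greedy rule that adds whichever plane most reduces the depth error. The names `render_layout`, `refine_layout`, and `tol` are hypothetical and chosen only for illustration.

```python
import numpy as np

def render_layout(plane_depths, selected):
    """Render a layout depth profile: the nearest selected plane wins at each pixel."""
    return np.min(plane_depths[sorted(selected)], axis=0)

def depth_error(plane_depths, selected, observed):
    """Mean absolute difference between the rendered layout and the observed depth."""
    return float(np.mean(np.abs(render_layout(plane_depths, selected) - observed)))

def refine_layout(plane_depths, observed, initial, tol=1e-3):
    """Greedy render-and-compare (toy assumption): repeatedly add the candidate
    plane that reduces the depth error the most, until no plane helps by more than tol."""
    selected = set(initial)
    error = depth_error(plane_depths, selected, observed)
    while True:
        best_plane, best_error = None, error
        for k in range(len(plane_depths)):
            if k in selected:
                continue
            trial = depth_error(plane_depths, selected | {k}, observed)
            if trial < best_error - tol:
                best_plane, best_error = k, trial
        if best_plane is None:
            return selected, error
        selected.add(best_plane)
        error = best_error

# Toy scanline with 101 pixels and four candidate planes (all values are made up).
x = np.linspace(0.0, 1.0, 101)
plane_depths = np.stack([
    np.full_like(x, 4.0),   # plane 0: back wall
    2.0 + 6.0 * x,          # plane 1: left wall
    8.0 - 6.0 * x,          # plane 2: right wall
    np.full_like(x, 1.0),   # plane 3: spurious plane that is not part of the layout
])

# The observed depth comes from the true layout, which uses planes 0, 1 and 2.
observed = np.min(plane_depths[[0, 1, 2]], axis=0)

# Start from the back wall only and let the loop recover the missing walls.
selected, error = refine_layout(plane_depths, observed, initial={0})
print(sorted(selected), error)
```

In this toy setting the loop starts from the back wall alone, adds the two side walls because each lowers the rendering error, and leaves out the spurious plane because adding it would increase the error. A real system would render 3D polygons under the camera model and would have to account for objects that occlude the layout, which this sketch ignores.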