CV · AI · Mar 23, 2018

LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image

arXiv:1803.08999v1 · 259 citations
Originality: Incremental advance
AI Analysis

This paper addresses 3D scene understanding for computer vision applications, presenting an incremental improvement over prior methods such as RoomNet.

The authors reconstruct 3D room layouts from single RGB images with LayoutNet, which generalizes across panorama and perspective images and handles both cuboid and more general Manhattan layouts (e.g. L-shaped rooms). The method is competitive in speed and accuracy with existing work on panoramas and achieves among the best accuracy on perspective images.

We propose an algorithm to predict room layout from a single image that generalizes across panoramas and perspective images, cuboid layouts and more general layouts (e.g. L-shape room). Our method operates directly on the panoramic image, rather than decomposing into perspective images as do recent works. Our network architecture is similar to that of RoomNet, but we show improvements due to aligning the image based on vanishing points, predicting multiple layout elements (corners, boundaries, size and translation), and fitting a constrained Manhattan layout to the resulting predictions. Our method compares well in speed and accuracy to other existing work on panoramas, achieves among the best accuracy for perspective images, and can handle both cuboid-shaped and more general Manhattan layouts.
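
As a rough illustration of the geometry behind operating directly on a panorama, the sketch below maps a predicted floor-corner pixel in an equirectangular image to floor-plan coordinates. It is not LayoutNet's code: the function names, the equirectangular convention (image centre looks along +y, horizon at mid-height), and the `camera_height` value of 1.6 m are all assumptions made for this example, and it presumes the image has already been aligned so the floor is horizontal.

```python
import math


def pixel_to_angles(u, v, width, height):
    """Map an equirectangular pixel to (longitude, latitude) in radians.

    Longitude is 0 at the image centre column; latitude is positive
    above the horizon (image mid-height) and negative below it.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    return lon, lat


def floor_corner_to_xy(u, v, width, height, camera_height=1.6):
    """Project a floor-corner pixel onto the floor plane (metres).

    Assumes an upright camera at `camera_height` above a flat floor
    (hypothetical default, not a value taken from the paper).
    """
    lon, lat = pixel_to_angles(u, v, width, height)
    if lat >= 0:
        raise ValueError("floor corners must lie below the horizon")
    # Horizontal distance from the camera to the corner.
    dist = camera_height / math.tan(-lat)
    return dist * math.sin(lon), dist * math.cos(lon)
```

Four floor corners projected this way give a floor-plan polygon; a Manhattan constraint of the kind the abstract mentions would then require adjacent walls of that polygon to meet at right angles.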

Code Implementations: 2 repos
