CV | Jul 4, 2024

UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos

arXiv:2407.03594v1 | 3 citations | h-index: 3
Originality: Highly original
AI Analysis

This work addresses the challenge of accurate 3D plane reconstruction from videos for applications in robotics and augmented reality. It represents a novel integration of plane detection and reconstruction rather than an incremental step.

The paper tackles the problem of plane detection and reconstruction from posed monocular videos by introducing UniPlane, a unified method that directly optimizes reconstruction quality and leverages temporal information, resulting in a +4.6 F-score improvement in geometry over state-of-the-art methods.

We present UniPlane, a novel method that unifies plane detection and reconstruction from posed monocular videos. Unlike existing methods that detect planes from local observations and associate them across the video for the final reconstruction, UniPlane unifies both the detection and the reconstruction tasks in a single network, which allows us to directly optimize final reconstruction quality and fully leverage temporal information. Specifically, we build a Transformer-based deep neural network that jointly constructs a 3D feature volume for the environment and estimates a set of per-plane embeddings as queries. UniPlane directly reconstructs the 3D planes by taking dot products between voxel embeddings and the plane embeddings, followed by binary thresholding. Extensive experiments on real-world datasets demonstrate that UniPlane outperforms state-of-the-art methods in both plane detection and reconstruction tasks, achieving a +4.6 gain in geometry F-score as well as consistent improvements in other geometry and segmentation metrics.
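The final reconstruction step described in the abstract (dot products between voxel embeddings and plane embeddings, followed by binary thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `assign_voxels_to_planes`, the toy 2-D embeddings, and the threshold value are all hypothetical, and the sketch thresholds the raw dot product directly, whereas the paper may normalize scores first. Plain Python lists are used only to keep the example dependency-free; a real system would operate on tensors.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def assign_voxels_to_planes(voxel_embeddings, plane_embeddings, threshold):
    """Return masks where masks[p][v] is True if voxel v belongs to plane p.

    Assumption: membership is decided by comparing the raw dot product
    against a fixed threshold (hypothetical choice for illustration).
    """
    return [
        [dot(voxel, plane) > threshold for voxel in voxel_embeddings]
        for plane in plane_embeddings
    ]


# Toy data (made up): four voxels and two plane queries, 2-D embeddings.
voxels = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, -1.0]]
planes = [[1.0, 0.0], [0.0, 1.0]]

masks = assign_voxels_to_planes(voxels, planes, threshold=0.25)
for plane_index, mask in enumerate(masks):
    print(plane_index, mask)
```

In this toy example the third voxel scores 0.5 against both plane queries and so passes the threshold for both, while the fourth voxel scores -1.0 and is assigned to neither. How such overlaps and unassigned voxels are resolved is not stated in the abstract.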

