CV · Mar 12, 2024

Q-SLAM: Quadric Representations for Monocular SLAM

Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang, Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan

Berkeley

arXiv:2403.08125v2 · 12.814 citations · h-index: 41 · CoRL

Originality: Incremental advance

AI Analysis

This work addresses efficiency and accuracy challenges in 3D scene reconstruction for robotics and AR/VR applications, representing an incremental improvement over existing volumetric SLAM methods.

The paper tackles the problem of inefficient volumetric representations in monocular SLAM by decomposing rigid scene components into quadric surfaces, which improves depth estimation accuracy and reduces computational requirements compared to previous NeRF-SLAM systems.

In this paper, we reimagine volumetric representations through the lens of quadrics. We posit that rigid scene components can be effectively decomposed into quadric surfaces. Leveraging this assumption, we reshape the volumetric representation, replacing millions of cubes with several quadric planes, which results in more accurate and efficient modeling of 3D scenes in SLAM contexts. First, we use the quadric assumption to rectify noisy depth estimations from RGB inputs. This step significantly improves depth estimation accuracy and allows us to efficiently sample ray points around quadric planes instead of across the entire volume space, as in previous NeRF-SLAM systems. Second, we introduce a novel quadric-decomposed transformer to aggregate information across quadrics. The quadric semantics are not only explicitly used for depth correction and scene decomposition, but also serve as an implicit supervision signal for the mapping network. Through rigorous experimental evaluation, our method exhibits superior performance over other approaches relying on estimated depth, and achieves comparable accuracy to methods utilizing ground truth depth on both synthetic and real-world datasets.
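The depth-rectification and near-surface sampling idea described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation: it assumes a plain algebraic least-squares quadric fit and a ray-quadric intersection, and the function names `fit_quadric`, `rectify_depth`, and `sample_near_surface` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def fit_quadric(points):
    """Algebraic least-squares fit of a quadric p^T Q p = 0 (p homogeneous)
    to an (N, 3) array of 3D points. Assumption: a generic fit, not the
    paper's procedure."""
    x, y, z = points.T
    # One row of monomials per point; the factors of 2 match a symmetric Q.
    A = np.stack([x*x, y*y, z*z, 2*x*y, 2*x*z, 2*y*z,
                  2*x, 2*y, 2*z, np.ones_like(x)], axis=1)
    # The right singular vector with the smallest singular value minimizes
    # ||A q|| subject to ||q|| = 1.
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    q = vt[-1]
    return np.array([[q[0], q[3], q[4], q[6]],
                     [q[3], q[1], q[5], q[7]],
                     [q[4], q[5], q[2], q[8]],
                     [q[6], q[7], q[8], q[9]]])

def rectify_depth(Q, origin, direction, noisy_t):
    """Snap a noisy depth along the ray origin + t * direction to the nearest
    positive ray-quadric intersection; return noisy_t if the ray misses."""
    o = np.append(origin, 1.0)      # homogeneous point
    d = np.append(direction, 0.0)   # homogeneous direction
    a = d @ Q @ d
    b = 2.0 * (o @ Q @ d)
    c = o @ Q @ o
    if abs(a) < 1e-12:              # degenerate case: equation is linear in t
        if abs(b) < 1e-12:
            return noisy_t
        roots = np.array([-c / b])
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0:                # no real intersection
            return noisy_t
        s = np.sqrt(disc)
        roots = np.array([(-b - s) / (2 * a), (-b + s) / (2 * a)])
    roots = roots[roots > 0]
    if roots.size == 0:
        return noisy_t
    return float(roots[np.argmin(np.abs(roots - noisy_t))])

def sample_near_surface(t_surface, half_width, n_samples):
    """Place ray samples in a narrow band around the rectified depth
    instead of along the whole ray."""
    return np.linspace(t_surface - half_width, t_surface + half_width, n_samples)

if __name__ == "__main__":
    # Toy scene: noisy points on a sphere of radius 2 centered at (0, 0, 5).
    rng = np.random.default_rng(0)
    dirs = rng.normal(size=(500, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    pts = np.array([0.0, 0.0, 5.0]) + 2.0 * dirs + rng.normal(scale=0.01, size=(500, 3))
    Q = fit_quadric(pts)
    t = rectify_depth(Q, np.zeros(3), np.array([0.0, 0.0, 1.0]), noisy_t=3.2)
    print(t, sample_near_surface(t, 0.1, 5))
```

In the toy scene, a ray from the origin along the z-axis meets the true sphere at depths 3 and 7, so a noisy estimate of 3.2 is pulled toward the nearer intersection, and the samples then cover only a thin band around it. A real system would fit quadrics per segmented scene component and would likely use a more robust fitting scheme than this unweighted algebraic fit.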
