CV · RO · Mar 21, 2023

Monocular Visual-Inertial Depth Estimation

Diana Wofk, René Ranftl, Matthias Müller, Vladlen Koltun

arXiv:2303.12134v1 · 13.125 citations · h-index: 113 · Has Code

Originality: Incremental advance

AI Analysis

This work targets depth estimation for robotics and autonomous systems, improving accuracy when only sparse metric depth points are available. It is an incremental advance because it builds on existing monocular depth estimation and visual-inertial odometry methods.

The paper tackles monocular visual-inertial depth estimation by integrating monocular depth estimation with visual-inertial odometry to produce dense, metric-scale depth maps. Dense alignment reduces inverse RMSE (iRMSE) by up to 30% relative to global alignment alone. With only 150 sparse metric depth points, the method achieves over 50% lower iRMSE than KBNet, the sparse-to-dense depth completion method the authors cite as state of the art on VOID.

We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry to produce dense depth estimates with metric scale. Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment. We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in inverse RMSE (iRMSE) with dense scale alignment relative to global alignment alone. Our approach is especially competitive at low density: with just 150 sparse metric depth points, our dense-to-dense depth alignment method achieves over 50% lower iRMSE than sparse-to-dense depth completion by KBNet, currently the state of the art on VOID. We demonstrate successful zero-shot transfer from synthetic TartanAir to real-world VOID data and perform generalization tests on NYUv2 and VCU-RVI. Our approach is modular and compatible with a variety of monocular depth estimation models.

Video: https://youtu.be/IMwiKwSpshQ
Code: https://github.com/isl-org/VI-Depth
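The global scale and shift alignment step mentioned in the abstract can be illustrated with a small least-squares fit. The following is a minimal sketch, not the authors' implementation: it assumes the alignment is performed in inverse-depth space, and the function names `global_scale_shift` and `irmse`, the array shapes, and the synthetic data are all hypothetical, chosen only for illustration. It fits one scale `s` and one shift `t` so that `s * pred + t` matches the sparse metric values at the sampled points, then reports the inverse RMSE over the full map.

```python
import numpy as np


def global_scale_shift(pred, sparse_metric, mask):
    """Fit scalars s, t minimizing sum((s * pred + t - sparse_metric)^2) over mask.

    Illustrative helper (hypothetical name); inputs are assumed to be
    inverse-depth maps of identical shape, with mask marking sparse points.
    """
    x = pred[mask].astype(np.float64)
    y = sparse_metric[mask].astype(np.float64)
    # Design matrix [x, 1] for the affine model y ≈ s * x + t.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t


def irmse(pred_inv, gt_inv, valid):
    """Root-mean-square error between inverse-depth maps over valid pixels."""
    diff = pred_inv[valid] - gt_inv[valid]
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic example: the "relative" prediction is the ground truth with an
# unknown scale (2.0) and shift (0.3) removed. Real predictions would not be
# an exact affine transform of the ground truth.
rng = np.random.default_rng(0)
gt_inv = rng.uniform(0.1, 1.0, size=(48, 64))
rel = (gt_inv - 0.3) / 2.0

# Pick 150 sparse points, mirroring the low-density setting in the abstract.
mask = np.zeros(gt_inv.shape, dtype=bool)
idx = rng.choice(gt_inv.size, size=150, replace=False)
mask.flat[idx] = True

s, t = global_scale_shift(rel, gt_inv, mask)
aligned = s * rel + t
error = irmse(aligned, gt_inv, np.ones_like(mask))
print(s, t, error)
```

A single global scale and shift cannot correct spatially varying errors in a real monocular prediction, which is the gap the learning-based dense alignment stage in the abstract is meant to address.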

