CV, RO · Dec 23, 2021

MDN-VO: Estimating Visual Odometry with Confidence

arXiv:2112.12812v1 · 14 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for efficient and reliable visual odometry in robotics and autonomous systems. It offers an incremental improvement by integrating uncertainty estimation into a deep learning framework.

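The paper proposes a deep learning model for visual odometry that estimates 6-DoF poses together with a confidence for each estimate. A CNN-RNN hybrid extracts spatio-temporal features from image sequences, and a Mixture Density Network models camera motion as a mixture of Gaussians. Poses are supervised with pose labels, while the uncertainties are learned without any uncertainty labels. The authors report that the model exceeds state-of-the-art performance on the KITTI and nuScenes datasets and that the predicted uncertainties can be used to detect failure cases.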

Visual Odometry (VO) is used in many applications including robotics and autonomous systems. However, traditional approaches based on feature matching are computationally expensive and do not directly address failure cases, instead relying on heuristic methods to detect failure. In this work, we propose a deep learning-based VO model to efficiently estimate 6-DoF poses, as well as a confidence model for these estimates. We utilise a CNN-RNN hybrid model to learn feature representations from image sequences. We then employ a Mixture Density Network (MDN) which estimates camera motion as a mixture of Gaussians, based on the extracted spatio-temporal representations. Our model uses pose labels as a source of supervision, but derives uncertainties in an unsupervised manner. We evaluate the proposed model on the KITTI and nuScenes datasets and report extensive quantitative and qualitative results to analyse the performance of both pose and uncertainty estimation. Our experiments show that the proposed model exceeds state-of-the-art performance in addition to detecting failure cases using the predicted pose uncertainty.
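
To make the MDN idea concrete, here is a minimal NumPy sketch of how a mixture-of-Gaussians output can be trained from pose labels alone. It is an illustration under stated assumptions, not the authors' implementation: the diagonal covariance, the log-sigma parameterisation, and the function names `mdn_nll` and `mixture_mean_and_variance` are all hypothetical choices that the abstract does not specify. The point it illustrates is that minimising the negative log-likelihood of the labelled pose also shapes the predicted variances, so no uncertainty labels are needed.

```python
import numpy as np


def mdn_nll(pi_logits, mu, log_sigma, target):
    """Negative log-likelihood of `target` under a diagonal Gaussian mixture.

    Assumed shapes (hypothetical, for illustration only):
      pi_logits: (K,)   unnormalised mixture weights
      mu:        (K, D) component means (D = 6 for a 6-DoF pose)
      log_sigma: (K, D) per-dimension log standard deviations
      target:    (D,)   labelled relative pose
    """
    # Log-softmax over the mixture weights, computed stably.
    log_pi = pi_logits - np.logaddexp.reduce(pi_logits)
    # Log-density of the target under each diagonal Gaussian component.
    z = (target - mu) / np.exp(log_sigma)
    log_prob = -0.5 * np.sum(z**2 + 2.0 * log_sigma + np.log(2.0 * np.pi), axis=1)
    # Log-sum-exp over components gives the mixture log-likelihood.
    return -np.logaddexp.reduce(log_pi + log_prob)


def mixture_mean_and_variance(pi_logits, mu, log_sigma):
    """Per-dimension mean and variance of the mixture (moment matching)."""
    pi = np.exp(pi_logits - np.logaddexp.reduce(pi_logits))
    mean = pi @ mu
    second_moment = pi @ (np.exp(2.0 * log_sigma) + mu**2)
    return mean, second_moment - mean**2


# Example: two equally weighted components that disagree about the motion.
pi_logits = np.zeros(2)
mu = np.stack([np.ones(6), -np.ones(6)])
log_sigma = np.zeros((2, 6))
mean, var = mixture_mean_and_variance(pi_logits, mu, log_sigma)
loss = mdn_nll(pi_logits, mu, log_sigma, target=np.ones(6))
```

In this sketch the mixture variance grows when components disagree, which is one plausible way a large predicted uncertainty could flag a likely failure case. How the paper actually derives its confidence score from the mixture is not stated in the abstract.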

