CV · Jan 29, 2017

VINet: Visual-Inertial Odometry as a Sequence-to-Sequence Learning Problem

Ronald Clark, Sen Wang, Hongkai Wen, Andrew Markham, Niki Trigoni

arXiv:1701.08376v2 23.6404 citations

Originality: Highly original

AI Analysis

This work addresses motion estimation for robotics and autonomous systems by learning the sensor fusion step end to end, though it is incremental in that it builds on existing sequence-to-sequence learning.

The paper tackles visual-inertial odometry by proposing VINet, an end-to-end trainable sequence-to-sequence learning method that fuses visual and inertial data at an intermediate feature level, eliminating the need for manual synchronization and manual calibration between the camera and IMU. It is competitive with state-of-the-art traditional methods when accurate calibration data is available and can be trained to outperform them in the presence of calibration and synchronization errors.

In this paper we present an on-manifold sequence-to-sequence learning approach to motion estimation using visual and inertial sensors. It is to the best of our knowledge the first end-to-end trainable method for visual-inertial odometry which performs fusion of the data at an intermediate feature-representation level. Our method has numerous advantages over traditional approaches. Specifically, it eliminates the need for tedious manual synchronization of the camera and IMU as well as eliminating the need for manual calibration between the IMU and camera. A further advantage is that our model naturally and elegantly incorporates domain specific information which significantly mitigates drift. We show that our approach is competitive with state-of-the-art traditional methods when accurate calibration data is available and can be trained to outperform them in the presence of calibration and synchronization errors.
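To make "fusion at an intermediate feature-representation level" concrete, the sketch below concatenates a per-frame visual feature vector with a feature vector summarizing the IMU samples recorded since the previous frame, then passes the fused vector through a recurrent cell that emits one 6-DoF motion estimate per frame. It is a minimal illustration under stated assumptions, not the authors' architecture: the mean-pooling IMU encoder, the plain tanh recurrent cell, the random untrained weights, and every name and dimension (`FusionRNN`, `encode_imu`, `visual_dim`, and so on) are hypothetical stand-ins chosen to keep the example dependency-free.

```python
import math
import random


def encode_imu(window):
    """Hypothetical inertial encoder: per-channel mean over a window of IMU samples."""
    n = len(window)
    return [sum(sample[i] for sample in window) / n for i in range(len(window[0]))]


def linear(weights, vec):
    """Matrix-vector product with the matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class FusionRNN:
    """Toy sequence model: fuse features per frame, carry state across frames."""

    def __init__(self, visual_dim, imu_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.w_in = random_matrix(hidden_dim, visual_dim + imu_dim, rng)
        self.w_rec = random_matrix(hidden_dim, hidden_dim, rng)
        self.w_out = random_matrix(6, hidden_dim, rng)  # 3 translation + 3 rotation

    def run(self, visual_feats, imu_windows):
        hidden = [0.0] * self.hidden_dim
        motions = []
        for vis, window in zip(visual_feats, imu_windows):
            # Feature-level fusion: concatenate the two encoded modalities.
            fused = list(vis) + encode_imu(window)
            pre = [a + b for a, b in zip(linear(self.w_in, fused),
                                         linear(self.w_rec, hidden))]
            hidden = [math.tanh(x) for x in pre]
            motions.append(linear(self.w_out, hidden))
        return motions


# Synthetic inputs: 4 frames, 8-dim visual features, 10 six-channel IMU samples per frame.
rng = random.Random(1)
visual_feats = [[rng.random() for _ in range(8)] for _ in range(4)]
imu_windows = [[[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(10)]
               for _ in range(4)]

model = FusionRNN(visual_dim=8, imu_dim=6, hidden_dim=16)
motions = model.run(visual_feats, imu_windows)
print(len(motions), len(motions[0]))  # number of frames, motion dimensions
```

Because the two streams are only joined after each has been encoded, the camera and IMU may run at different rates, and the rest of the network never sees raw, unaligned samples. In a learned system the encoders and the recurrent weights would be trained jointly from data.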
