Deep Online Correction for Monocular Visual Odometry
This work addresses the accuracy and efficiency of monocular visual odometry for robotics and autonomous systems, and it is an incremental improvement over existing methods.
The authors propose a deep online correction framework that refines the initial pose predictions of CNNs through gradient-based minimization of photometric error during inference, achieving a relative transform error of 2.0% on Seq. 09 of the KITTI Odometry benchmark.
In this work, we propose a novel deep online correction (DOC) framework for monocular visual odometry. The pipeline has two stages. First, depth maps and initial poses are obtained from convolutional neural networks (CNNs) trained in a self-supervised manner. Second, the poses predicted by the CNNs are further improved by minimizing photometric errors via gradient updates of the poses during inference. The benefits of the proposed method are twofold: 1) Unlike online-learning methods, DOC does not need to propagate gradients to the parameters of the CNNs, which saves computation during inference. 2) Unlike hybrid methods that combine CNNs with traditional methods, DOC relies entirely on deep learning (DL) frameworks. Even without complex back-end optimization modules, our method achieves strong performance, with a relative transform error (RTE) of 2.0% on Seq. 09 of the KITTI Odometry benchmark, which outperforms traditional monocular VO frameworks and is comparable to hybrid methods.
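
The second stage can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the "image" is a one-dimensional analytic intensity function, the "pose" is a single scalar shift, the initial guess stands in for a pose-network prediction, and plain gradient descent is used as the optimizer. All names (`source_image`, `refine_pose`, `TRUE_SHIFT`, and so on), the learning rate, and the step count are assumptions made for illustration. The point it shows is that only the pose variable receives gradient updates, while nothing resembling network parameters is touched.

```python
import math

def source_image(u):
    """Toy continuous 1D 'source frame' intensity at coordinate u (hypothetical)."""
    return math.sin(0.3 * u) + 0.5 * math.sin(0.11 * u)

def source_gradient(u):
    """Spatial derivative of source_image with respect to u."""
    return 0.3 * math.cos(0.3 * u) + 0.055 * math.cos(0.11 * u)

def photometric_error(shift, pixels, target):
    """Mean squared intensity difference after warping the source by `shift`."""
    total = 0.0
    for x, t in zip(pixels, target):
        total += (source_image(x + shift) - t) ** 2
    return total / len(pixels)

def error_gradient(shift, pixels, target):
    """Analytic derivative of photometric_error with respect to `shift`."""
    total = 0.0
    for x, t in zip(pixels, target):
        residual = source_image(x + shift) - t
        total += 2.0 * residual * source_gradient(x + shift)
    return total / len(pixels)

def refine_pose(initial_shift, pixels, target, lr=5.0, steps=50):
    """Gradient descent on the pose only; no network weights are updated."""
    shift = initial_shift
    for _ in range(steps):
        shift -= lr * error_gradient(shift, pixels, target)
    return shift

# Synthetic target frame: the source observed under a known shift.
TRUE_SHIFT = 4.0
pixels = list(range(200))
target = [source_image(x + TRUE_SHIFT) for x in pixels]

# Stand-in for the pose network's initial prediction (assumed to be off by 1.5).
initial_shift = 5.5
refined_shift = refine_pose(initial_shift, pixels, target)

print("initial error:", photometric_error(initial_shift, pixels, target))
print("refined error:", photometric_error(refined_shift, pixels, target))
print("refined shift:", refined_shift)
```

In a real pipeline the warp would depend on a predicted depth map, camera intrinsics, and a 6-DoF pose, and the gradient would typically come from automatic differentiation rather than a hand-written derivative. The structure of the loop, a frozen predictor followed by a few gradient steps on the pose, is the part this sketch is meant to convey.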