CV · Dec 22, 2024

Leveraging Consistent Spatio-Temporal Correspondence for Robust Visual Odometry

arXiv:2412.16923v4 · 9 citations · h-index: 12 · AAAI
Originality: Highly original
AI Analysis

This work addresses robust pose estimation for robotics and autonomous systems, representing a strong specific gain rather than a foundational advancement.

The paper tackles noisy and inconsistent optical flow matching in visual odometry by introducing STVO, a deep network architecture that leverages spatio-temporal cues to improve flow accuracy and consistency. It reports state-of-the-art performance, with accuracy improvements of 77.8% on ETH3D and 38.9% on KITTI Odometry over the previous best methods.

Recent approaches to visual odometry (VO) have significantly improved performance by using deep networks to predict optical flow between video frames. However, existing methods still suffer from noisy and inconsistent flow matching, which makes challenging scenarios and long-sequence estimation difficult to handle. To overcome these challenges, we introduce Spatio-Temporal Visual Odometry (STVO), a novel deep network architecture that leverages inherent spatio-temporal cues to enhance the accuracy and consistency of multi-frame flow matching. With more accurate and consistent flow matching, STVO achieves better pose estimation through bundle adjustment (BA). Specifically, STVO introduces two components: 1) the Temporal Propagation Module, which uses multi-frame information to extract and propagate temporal cues across adjacent frames, maintaining temporal consistency; 2) the Spatial Activation Module, which uses geometric priors from the depth maps to enhance spatial consistency while filtering out excessive noise and incorrect matches. STVO achieves state-of-the-art performance on the TUM-RGBD, EuRoC MAV, ETH3D and KITTI Odometry benchmarks. Notably, it improves accuracy by 77.8% on the ETH3D benchmark and 38.9% on the KITTI Odometry benchmark over the previous best methods.
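To make the link between flow matching and bundle adjustment concrete, here is a minimal sketch of the kind of reprojection residual that flow-based VO pipelines commonly minimize. It is not taken from the STVO paper: the pinhole camera model, the function names, and the example values are all assumptions made for illustration. The idea is that a pixel with known depth, moved by a candidate relative pose, should land where the predicted flow says it lands. Noisy or inconsistent flow inflates this residual and can pull the optimized pose away from the true one.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with known depth to a 3D point in the camera frame."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def transform(R: Sequence[Sequence[float]], t: Sequence[float], p: Vec3) -> Vec3:
    """Apply the rigid transform p' = R p + t."""
    x, y, z = (sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))
    return (x, y, z)


def project(p: Vec3, fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a 3D point (z > 0) to pixel coordinates."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def flow_reprojection_residual(pixel, depth, flow, R, t, intrinsics):
    """Residual between the flow-predicted match and the pose-induced reprojection.

    pixel:      (u, v) in frame i
    depth:      depth of that pixel in frame i
    flow:       predicted optical flow (du, dv) from frame i to frame j
    R, t:       hypothetical relative pose mapping frame-i points into frame j
    intrinsics: (fx, fy, cx, cy) of an assumed pinhole camera
    """
    fx, fy, cx, cy = intrinsics
    u, v = pixel
    point_j = transform(R, t, backproject(u, v, depth, fx, fy, cx, cy))
    u_proj, v_proj = project(point_j, fx, fy, cx, cy)
    return (u + flow[0] - u_proj, v + flow[1] - v_proj)


# Illustrative values only: identity rotation, small sideways translation.
K = (500.0, 500.0, 320.0, 240.0)
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
residual = flow_reprojection_residual(
    pixel=(420.0, 240.0), depth=2.0, flow=(-23.0, 1.0),
    R=IDENTITY, t=(-0.1, 0.0, 0.0), intrinsics=K,
)
print(residual)  # nonzero: this flow disagrees with the pose-induced motion
```

A bundle adjustment step would sum the squared residuals over many pixels and frame pairs and solve for the poses and depths that minimize the total. Confidence weighting, robust losses, and the solver itself are omitted from this sketch.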

Code Implementations: 1 repo
