CV | AI | Oct 4, 2022

Dense Prediction Transformer for Scale Estimation in Monocular Visual Odometry

arXiv:2210.01723v1 | 17 citations | h-index: 13
Originality: Incremental advance
AI Analysis

The paper addresses scale drift in applications such as autonomous vehicles and medical robots, but the advance is incremental: it adapts an existing model to a known bottleneck.

The paper tackles the scale ambiguity problem in monocular visual odometry by applying a dense prediction transformer model to scale estimation, reaching performance competitive with the state of the art on a visual odometry benchmark.

Monocular visual odometry estimates the position of an agent from the images of a single camera, and it is applied in autonomous vehicles, medical robots, and augmented reality. However, monocular systems suffer from the scale ambiguity problem because 2D frames carry no depth information. This paper applies the dense prediction transformer model to scale estimation in monocular visual odometry systems. Experimental results show that accurate depth maps estimated by this model reduce the scale drift of monocular systems, with performance competitive with the state of the art on a visual odometry benchmark.
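To make the role of the depth map concrete, here is a minimal sketch of one common way a predicted depth map can fix the scale of a monocular pose estimate. It is an illustration under stated assumptions, not the paper's procedure: the function names `estimate_scale` and `rescale_translation`, the median-ratio rule, and the toy numbers are all hypothetical. The idea is that depths triangulated from two frames are known only up to an unknown factor, so comparing them with metric depths predicted at the same pixels gives an estimate of that factor.

```python
from statistics import median


def estimate_scale(triangulated_depths, predicted_depths, min_depth=1e-6):
    """Estimate the factor mapping up-to-scale depths onto predicted metric depths.

    Assumption: both sequences hold depths for the same pixels, in the same order.
    The median of the per-pixel ratios is used so that a few bad matches
    do not dominate the estimate.
    """
    ratios = [
        d_pred / d_tri
        for d_tri, d_pred in zip(triangulated_depths, predicted_depths)
        if d_tri > min_depth and d_pred > min_depth  # skip invalid depths
    ]
    if not ratios:
        raise ValueError("no valid depth pairs to estimate scale from")
    return median(ratios)


def rescale_translation(translation, scale):
    """Apply the estimated scale to an up-to-scale translation vector."""
    return [scale * t for t in translation]


# Toy data (hypothetical): the last predicted depth is a deliberate outlier.
triangulated = [0.5, 1.0, 2.0, 4.0, 1.5]
predicted = [2.0, 4.0, 8.0, 16.0, 60.0]

scale = estimate_scale(triangulated, predicted)
metric_translation = rescale_translation([0.0, 0.0, 1.0], scale)
```

Recomputing the factor for every frame pair, instead of fixing it once at initialization, is what would keep scale errors from accumulating along the trajectory in a scheme like this.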

Code Implementations: 1 repo