Look Ma, No Ground Truth! Ground-Truth-Free Tuning of Structure from Motion and Visual SLAM
This work addresses a scalability bottleneck for SfM and SLAM applications in diverse environments by enabling evaluation without ground truth, though the contribution is incremental because it builds on existing sensitivity estimation techniques.
The paper tackles the problem of evaluating and tuning Structure from Motion and Visual SLAM systems without relying on costly geometric ground truth. It proposes a ground-truth-free method that estimates sensitivity by sampling from both original and noisy versions of the input images, and it reports strong correlation with traditional ground-truth-based benchmarks.
Evaluation is critical to both developing and tuning Structure from Motion (SfM) and Visual SLAM (VSLAM) systems, but is universally reliant on high-quality geometric ground truth -- a resource that is not only costly and time-intensive but, in many cases, entirely unobtainable. This dependency on ground truth restricts SfM and SLAM applications across diverse environments and limits scalability to real-world scenarios. In this work, we propose a novel ground-truth-free (GTF) evaluation methodology that eliminates the need for geometric ground truth, instead using sensitivity estimation via sampling from both original and noisy versions of input images. Our approach shows strong correlation with traditional ground-truth-based benchmarks and supports GTF hyperparameter tuning. Removing the need for ground truth opens up new opportunities to leverage a much larger number of dataset sources, and for self-supervised and online tuning, with the potential for a data-driven breakthrough analogous to what has occurred in generative AI.
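As an illustration only, the idea of scoring a system by its sensitivity to image noise can be sketched in Python. The abstract does not specify the noise model, the number of samples, or the exact metric, so everything below is an assumption: `add_noise`, `sensitivity_score`, and `toy_pipeline` are hypothetical names, images are stand-in lists of pixel values, and the "pipeline" is a toy function rather than a real SfM or VSLAM system.

```python
import random
import statistics
from typing import Callable, List, Sequence

Image = List[float]        # toy stand-in for an image: a flat list of pixel values
Trajectory = List[float]   # toy stand-in for one scalar pose estimate per frame
Pipeline = Callable[[Sequence[Image]], Trajectory]


def add_noise(images: Sequence[Image], sigma: float, rng: random.Random) -> List[Image]:
    """Return a copy of the images with zero-mean Gaussian pixel noise added (assumed noise model)."""
    return [[p + rng.gauss(0.0, sigma) for p in img] for img in images]


def sensitivity_score(
    pipeline: Pipeline,
    images: Sequence[Image],
    n_samples: int = 8,
    sigma: float = 0.05,
    seed: int = 0,
) -> float:
    """Assumed metric: spread of the pipeline output over original and noisy inputs.

    Runs the pipeline once on the original images and n_samples times on
    noisy copies, then averages the per-frame standard deviation of the
    estimates. No ground-truth trajectory is used anywhere.
    """
    rng = random.Random(seed)
    runs = [pipeline(images)]
    for _ in range(n_samples):
        runs.append(pipeline(add_noise(images, sigma, rng)))
    # zip(*runs) groups the estimates for the same frame across all runs.
    per_frame_spread = [statistics.pstdev(frame_values) for frame_values in zip(*runs)]
    return statistics.fmean(per_frame_spread)


def toy_pipeline(gain: float) -> Pipeline:
    """Hypothetical estimator whose 'gain' hyperparameter scales its reaction to pixel values."""
    def run(images: Sequence[Image]) -> Trajectory:
        return [gain * statistics.fmean(img) for img in images]
    return run


if __name__ == "__main__":
    frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    # Ground-truth-free comparison of hyperparameter candidates by their scores.
    for gain in (0.5, 1.0, 2.0):
        print(gain, sensitivity_score(toy_pipeline(gain), frames))
```

In this toy, a pipeline that ignores its input scores a perfect zero, so the sketch shows only the mechanics of computing such a score from original and noisy inputs. It does not show how the paper relates the score to accuracy or uses it for tuning.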