CV | AI | RO | Apr 9

Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring

arXiv:2604.08718 | 51.3 | h-index: 14
AI Analysis

For SLAM researchers and practitioners, LeanGate offers a practical solution to the computational bottleneck of dense geometric decoding in GFM-based systems, enabling real-time deployment without sacrificing accuracy.

LeanGate reduces computational redundancy in GFM-based monocular SLAM by predicting a geometric utility score for each frame and skipping more than 90% of redundant frames, cutting tracking FLOPs by more than 85% and delivering a 5x end-to-end throughput speedup while maintaining the accuracy of dense baselines.

Geometric Foundation Models (GFMs) have recently advanced monocular SLAM by providing robust, calibration-free 3D priors. However, deploying these models on dense video streams introduces significant computational redundancy. Current GFM-based SLAM systems typically rely on post hoc keyframe selection. Because of this, they must perform expensive dense geometric decoding simply to determine whether a frame contains novel geometry, resulting in late rejection and wasted computation. To mitigate this inefficiency, we propose LeanGate, a lightweight feed-forward frame-gating network. LeanGate predicts a geometric utility score to assess a frame's mapping value prior to the heavy GFM feature extraction and matching stages. As a predictive plug-and-play module, our approach bypasses over 90% of redundant frames. Evaluations on standard SLAM benchmarks demonstrate that LeanGate reduces tracking FLOPs by more than 85% and achieves a 5x end-to-end throughput speedup. Furthermore, it maintains the tracking and mapping accuracy of dense baselines.
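The gating idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: `gate_frames`, `score_fn`, `threshold`, and `relative_cost` are hypothetical names, the scorer is a toy stand-in for the learned gating network, and scoring each frame against the last kept frame is an assumption. The cost helper only works through the arithmetic of how a 90% skip ratio could translate into roughly 85% fewer FLOPs, assuming the gate costs about 5% of a full GFM pass.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


@dataclass
class GateStats:
    seen: int = 0   # frames offered to the gate
    kept: int = 0   # frames passed on to the heavy GFM stage

    @property
    def skip_ratio(self) -> float:
        """Fraction of frames rejected before the GFM stage."""
        return 0.0 if self.seen == 0 else 1.0 - self.kept / self.seen


def gate_frames(
    frames: Sequence[Frame],
    score_fn: Callable[[Frame, Frame], float],
    threshold: float,
) -> Tuple[List[Frame], GateStats]:
    """Keep a frame only if its predicted utility reaches the threshold.

    score_fn(frame, last_kept) is a hypothetical stand-in for the learned
    utility predictor; the first frame is always kept to seed the map.
    """
    kept: List[Frame] = []
    stats = GateStats()
    for frame in frames:
        stats.seen += 1
        if not kept or score_fn(frame, kept[-1]) >= threshold:
            kept.append(frame)  # only these frames would reach the GFM
    stats.kept = len(kept)
    return kept, stats


def relative_cost(skip_ratio: float, gate_cost: float, gfm_cost: float) -> float:
    """Average per-frame cost with gating, relative to running the GFM on every frame.

    Every frame pays gate_cost; only the kept fraction pays gfm_cost.
    """
    return (gate_cost + (1.0 - skip_ratio) * gfm_cost) / gfm_cost


if __name__ == "__main__":
    # Toy stream: integer "camera positions"; utility = distance moved
    # since the last kept frame (an illustrative assumption).
    frames = list(range(20))
    kept, stats = gate_frames(frames, lambda f, k: abs(f - k), threshold=10)
    print(kept, stats.skip_ratio)
    # Assumed gate cost of 5% of a GFM pass, 90% of frames skipped.
    print(relative_cost(skip_ratio=0.9, gate_cost=0.05, gfm_cost=1.0))
```

Under these assumed numbers the gated pipeline costs about 15% of the dense baseline, which matches the order of magnitude the abstract reports. The actual ratio depends on the real cost of the gating network, which the abstract does not state.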

