RO · CV · Dec 22, 2025

LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry

arXiv:2512.19629v2 · 4 citations · h-index: 21
Originality: Incremental advance
AI Analysis

This paper addresses latency and error propagation in modular robot navigation systems, offering a more efficient solution for mobile robots in open-world settings, though it builds incrementally on prior end-to-end methods.

The paper tackles trajectory planning for mobile robots in unstructured environments by introducing LoGoPlanner, an end-to-end navigation framework that grounds its predictions in absolute metric scale and reconstructs surrounding scene geometry from historical observations. The authors report more than a 27.3% improvement over oracle-localization baselines and strong generalization across embodiments and environments.

Trajectory planning in unstructured environments is a fundamental and challenging capability for mobile robots. Traditional modular pipelines suffer from latency and cascading errors across perception, localization, mapping, and planning modules. Recent end-to-end learning methods map raw visual observations directly to control signals or trajectories, promising greater performance and efficiency in open-world settings. However, most prior end-to-end approaches still rely on separate localization modules that depend on accurate sensor extrinsic calibration for self-state estimation, thereby limiting generalization across embodiments and environments. We introduce LoGoPlanner, a localization-grounded, end-to-end navigation framework that addresses these limitations by: (1) finetuning a long-horizon visual-geometry backbone to ground predictions with absolute metric scale, thereby providing implicit state estimation for accurate localization; (2) reconstructing surrounding scene geometry from historical observations to supply dense, fine-grained environmental awareness for reliable obstacle avoidance; and (3) conditioning the policy on implicit geometry bootstrapped by the aforementioned auxiliary tasks, thereby reducing error propagation. We evaluate LoGoPlanner in both simulation and real-world settings, where its fully end-to-end design reduces cumulative error while metric-aware geometry memory enhances planning consistency and obstacle avoidance, leading to more than a 27.3% improvement over oracle-localization baselines and strong generalization across embodiments and environments. The code and models are publicly available at https://steinate.github.io/logoplanner.github.io.
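To illustrate why metric scale matters when geometry from past observations is reused for obstacle avoidance, the following is a minimal sketch and not the authors' method. LoGoPlanner learns this grounding implicitly inside its backbone, whereas the sketch uses explicit 2D poses. The function name `to_current_frame`, the `(x, y, yaw)` pose format, and the `scale` parameter are all assumptions introduced here for illustration.

```python
import math

def to_current_frame(points, pose_past, pose_now, scale=1.0):
    """Map 2D points seen from a past robot pose into the current robot frame.

    points:    list of (x, y) in the past robot frame (x forward, y left),
               possibly known only up to scale.
    pose_past: (x, y, yaw) of the robot when the points were observed.
    pose_now:  (x, y, yaw) of the robot now, in the same world frame.
    scale:     hypothetical factor converting the points to metres.
    """
    px, py, pyaw = pose_past
    cx, cy, cyaw = pose_now
    result = []
    for x, y in points:
        # Apply the metric scale before any rigid transform.
        x, y = x * scale, y * scale
        # Past robot frame -> world frame.
        wx = px + math.cos(pyaw) * x - math.sin(pyaw) * y
        wy = py + math.sin(pyaw) * x + math.cos(pyaw) * y
        # World frame -> current robot frame (inverse rotation).
        dx, dy = wx - cx, wy - cy
        result.append((math.cos(cyaw) * dx + math.sin(cyaw) * dy,
                       -math.sin(cyaw) * dx + math.cos(cyaw) * dy))
    return result

# An obstacle seen 2 m ahead; the robot has since driven 1 m forward.
obstacle_now = to_current_frame([(2.0, 0.0)], (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))

# Same observation, but the geometry is wrongly scaled by 0.5.
obstacle_bad = to_current_frame([(2.0, 0.0)], (0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                                scale=0.5)
```

With the correct scale the obstacle lands 1 m ahead of the robot. With the wrong scale it lands on the robot's own position, which is the kind of inconsistency that metric grounding is meant to prevent.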

