Walk With Me: Long-Horizon Social Navigation for Human-Centric Outdoor Assistance
This work addresses the need for socially compliant, long-horizon robot navigation in outdoor settings without pre-built HD maps, a capability that is crucial for assistive robots in human-centric environments.
Walk with Me introduces a map-free framework for long-horizon social navigation in outdoor environments, enabling robots to follow high-level human instructions by combining GPS-based waypoint planning, vision-language models for intent grounding, and safety-aware reasoning. In real-world tests, it achieves a 90% success rate in reaching destinations and reduces unsafe behaviors by 40% compared to baselines.
Assisting humans in open-world outdoor environments requires robots to translate high-level natural-language intentions into safe, long-horizon, and socially compliant navigation behavior. Existing map-based methods rely on costly pre-built HD maps, while learning-based policies are mostly limited to indoor and short-horizon settings. To bridge this gap, we propose Walk with Me, a map-free framework for long-horizon social navigation from high-level human instructions. Walk with Me uses GPS context and lightweight candidate points of interest from a public map API for semantic destination grounding and waypoint proposal. A High-Level Vision-Language Model (VLM) grounds abstract instructions into concrete destinations and plans coarse waypoint sequences. During execution, an observation-aware routing mechanism determines whether the Low-Level Vision-Language-Action (VLA) policy can handle the current situation or whether explicit safety reasoning from the High-Level VLM is needed. Routine segments are executed by the Low-Level VLA, while complex situations such as crowded crossings trigger high-level reasoning, and the robot stops and waits when proceeding is unsafe. By combining semantic intent grounding, map-free long-horizon planning, safety-aware reasoning, and low-level action generation, Walk with Me enables practical outdoor social navigation for human-centric assistance.
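The routing idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the routing mechanism is built, so a simple hand-written rule stands in for it here. The names `Observation`, `route_observation`, `step`, `reached`, the `crowd_threshold` value, and the action strings are all hypothetical. The only standard technique used is the haversine formula for the distance between two GPS fixes.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


class Route(Enum):
    LOW_LEVEL = "low_level_vla"    # routine segment: the VLA policy acts directly
    HIGH_LEVEL = "high_level_vlm"  # complex scene: ask the VLM for safety reasoning


@dataclass
class Observation:
    # Hypothetical scene summary; the real system works from camera input.
    pedestrians_nearby: int
    at_crossing: bool


def route_observation(obs: Observation, crowd_threshold: int = 3) -> Route:
    """Rule-based stand-in (assumption) for the observation-aware router."""
    if obs.at_crossing or obs.pedestrians_nearby >= crowd_threshold:
        return Route.HIGH_LEVEL
    return Route.LOW_LEVEL


def step(obs: Observation, is_safe: Callable[[Observation], bool]) -> str:
    """Pick one action; `is_safe` stands in for the High-Level VLM's judgment."""
    if route_observation(obs) is Route.LOW_LEVEL:
        return "follow_waypoint"
    # Complex situation: proceed only if the high-level check says it is safe.
    return "follow_waypoint" if is_safe(obs) else "stop_and_wait"


def haversine_m(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def reached(position: Tuple[float, float], waypoint: Tuple[float, float],
            tolerance_m: float = 5.0) -> bool:
    """True once the GPS fix is within a (hypothetical) tolerance of the waypoint."""
    return haversine_m(position, waypoint) <= tolerance_m
```

In this sketch, a quiet sidewalk observation goes straight to the low-level policy, while a crossing is passed to the high-level check and produces `"stop_and_wait"` whenever that check reports the scene as unsafe. A learned router or a VLM query could replace `route_observation` and `is_safe` without changing the control flow.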