RO · Mar 16

NavThinker: Action-Conditioned World Models for Coupled Prediction and Planning in Social Navigation

arXiv:2603.15359 · 87.03 citations · h-index: 8 · Has Code
AI Analysis

This work addresses safe and effective robot navigation in dynamic human environments. Its approach lets the navigation policy anticipate how the scene and pedestrians will evolve under different robot actions, rather than reacting to current observations alone.

The paper tackles the coupled prediction-planning challenge in social navigation with NavThinker, a framework that integrates an action-conditioned world model with on-policy reinforcement learning. It reports state-of-the-art navigation success on single- and multi-robot Social-HM3D, zero-shot transfer to Social-MP3D, and real-world deployment on a Unitree Go2.

Social navigation requires robots to act safely in dynamic human environments. Effective behavior demands thinking ahead: reasoning about how the scene and pedestrians evolve under different robot actions rather than reacting to current observations alone. This creates a coupled prediction-planning challenge, where robot actions and human motion mutually influence each other. To address this challenge, we propose NavThinker, a future-aware framework that couples an action-conditioned world model with on-policy reinforcement learning. The world model operates in the Depth Anything V2 patch feature space and performs autoregressive prediction of future scene geometry and human motion; multi-head decoders then produce future depth maps and human trajectories, yielding a future-aware state aligned with traversability and interaction risk. Crucially, we train the policy with DD-PPO while injecting world-model think-ahead signals via: (i) action-conditioned future features fused into the current observation embedding and (ii) social reward shaping from predicted human trajectories. Experiments on single- and multi-robot Social-HM3D show state-of-the-art navigation success, with zero-shot transfer to Social-MP3D and real-world deployment on a Unitree Go2, validating generalization and practical applicability. Webpage: https://github.com/hutslib/NavThinker.
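To make point (ii) of the abstract concrete, the following is a minimal sketch of how predicted human trajectories could shape a reward. It is not the authors' implementation: the unicycle rollout, the penalty form, and every name and parameter here (`rollout_robot`, `social_penalty`, `shaped_reward`, `safe_dist`, `discount`, `weight`) are assumptions chosen for illustration. In NavThinker the human trajectories would come from the world model's decoder; here they are passed in as plain lists of 2D points.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def rollout_robot(start: Point, heading: float,
                  actions: Sequence[Tuple[float, float]],
                  dt: float = 0.5) -> List[Point]:
    """Roll out a unicycle model. Each action is (linear vel, angular vel).

    Hypothetical stand-in for the robot's future positions under a
    candidate action sequence.
    """
    x, y, theta = start[0], start[1], heading
    path: List[Point] = []
    for v, w in actions:
        theta += w * dt
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        path.append((x, y))
    return path


def social_penalty(robot_path: Sequence[Point],
                   human_paths: Sequence[Sequence[Point]],
                   safe_dist: float = 1.0,
                   discount: float = 0.9) -> float:
    """Discounted penalty for predicted intrusions into personal space.

    At each future step t, every human closer than safe_dist adds a term
    that grows linearly as the predicted distance shrinks to zero.
    """
    penalty = 0.0
    for t, (rx, ry) in enumerate(robot_path):
        for human in human_paths:
            if t >= len(human):
                continue  # no prediction for this human at step t
            hx, hy = human[t]
            dist = math.hypot(rx - hx, ry - hy)
            if dist < safe_dist:
                penalty += (discount ** t) * (safe_dist - dist) / safe_dist
    return penalty


def shaped_reward(task_reward: float,
                  robot_path: Sequence[Point],
                  human_paths: Sequence[Sequence[Point]],
                  weight: float = 0.5) -> float:
    """Task reward minus a weighted social penalty (assumed additive form)."""
    return task_reward - weight * social_penalty(robot_path, human_paths)


# Example: the robot drives straight; one human is predicted to stand
# exactly where the robot will be at the first future step.
robot = rollout_robot((0.0, 0.0), 0.0, [(1.0, 0.0), (1.0, 0.0)])
humans = [[(0.5, 0.0), (5.0, 5.0)]]
reward = shaped_reward(1.0, robot, humans)
```

Because the penalty depends on the robot path produced by a candidate action sequence, the shaped reward is itself action-conditioned, which is the coupling between prediction and planning that the abstract describes. The additive form and the weights above are illustrative choices only.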
