Do We Really Need Immediate Resets? Rethinking Collision Handling for Efficient Robot Navigation
For researchers applying deep reinforcement learning (DRL) to robot navigation, this work offers a simple change to the training procedure that improves learning efficiency and performance.
The paper challenges the standard practice of resetting the environment after every collision in DRL-based robot navigation, proposing a Multi-Collision reset Budget (MCB) framework that allows agents to retry difficult configurations within the same episode. Experiments show that MCB accelerates early-stage exploration and improves success rate and navigation efficiency over conventional baselines.
Should a single collision necessarily terminate an entire navigation episode? In most deep reinforcement learning (DRL) frameworks for robot navigation, this remains the standard practice: every collision immediately triggers a global environment reset and is penalized as a complete task failure. While a collision during deployment naturally indicates task failure, applying the same treatment during training prevents the agent from exploring challenging obstacle configurations, which slows learning progress in the early training phase. In this work, we challenge this convention and propose a Multi-Collision reset Budget (MCB) framework that decouples local collision termination from global environment resets, allowing the agent to retry difficult configurations within the same episode. Experiments on multiple simulated and real-world robotic platforms show that the framework accelerates early-stage exploration and improves both success rate and navigation efficiency over conventional single-collision reset baselines, with a small collision budget producing the largest gains.
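The decoupling described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how a retry is implemented, so everything here is an assumption made for illustration: the names `ToyCorridorEnv`, `CollisionBudgetWrapper`, and `respawn`, the reward values, and the choice to return the robot to its start pose in the same layout after a collision. It is not the authors' implementation.

```python
class ToyCorridorEnv:
    """Hypothetical 1-D stand-in for a navigation simulator (not from the paper)."""

    def __init__(self, goal=5, obstacle=3):
        self.goal = goal
        self.obstacle = obstacle
        self.pos = 0
        self.layouts_generated = 0

    def reset(self):
        # Global reset: a real simulator would sample a new layout here.
        self.layouts_generated += 1
        self.pos = 0
        return self.pos

    def respawn(self):
        # Local recovery (assumed): same layout, robot returns to the start pose.
        self.pos = 0
        return self.pos

    def step(self, action):
        # action is the number of cells to advance (1 = step, 2 = skip a cell).
        self.pos += action
        collided = self.pos == self.obstacle
        reached = self.pos >= self.goal
        return self.pos, collided, reached


class CollisionBudgetWrapper:
    """Sketch of a collision budget: collisions end the local trajectory,
    but the environment is reset globally only once the budget is spent."""

    def __init__(self, env, budget=3, collision_penalty=-1.0, goal_reward=1.0):
        self.env = env
        self.budget = budget
        self.collision_penalty = collision_penalty
        self.goal_reward = goal_reward
        self.collisions = 0

    def reset(self):
        self.collisions = 0
        return self.env.reset()

    def step(self, action):
        obs, collided, reached = self.env.step(action)
        reward = 0.0
        terminated = False   # local signal: do not bootstrap past this transition
        needs_reset = False  # global signal: caller should call reset()
        if collided:
            self.collisions += 1
            reward = self.collision_penalty
            terminated = True
            if self.collisions >= self.budget:
                needs_reset = True
            else:
                # Retry the same configuration within the same episode.
                obs = self.env.respawn()
        elif reached:
            reward = self.goal_reward
            terminated = True
            needs_reset = True
        return obs, reward, terminated, needs_reset
```

In this sketch, `budget=1` reproduces the conventional single-collision reset, while a larger budget lets the agent attempt the same obstacle layout again before a new one is generated. Each collision is still penalized and still marked terminal for the learning update; only the global reset is deferred.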