RO LG SY | Jun 7, 2024

Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning

Arvi Jonnarth, Ola Johansson, Jie Zhao, Michael Felsberg

arXiv:2406.04920v3 | 5.76 citations

Originality: Incremental advance

AI Analysis

This paper addresses the problem of efficient robotic coverage of unknown environments, for applications such as lawn mowing and search-and-rescue, and offers incremental improvements in sim-to-real transfer.

The paper tackles online coverage path planning (CPP) for unknown environments by proposing a deep reinforcement learning approach with an egocentric map representation and a total variation reward, achieving performance surpassing previous RL-based and specialized methods in simulation and successfully transferring to a real robot.

Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area, with applications ranging from robotic lawn mowing to search-and-rescue. While for known environments, offline methods can find provably complete paths, and in some cases optimal solutions, unknown environments need to be planned online during mapping. We investigate the suitability of continuous-space reinforcement learning (RL) for this challenging problem, and propose a computationally feasible egocentric map representation based on frontiers, as well as a novel reward term based on total variation to promote complete coverage. Compared to existing classical methods, this approach allows for a flexible path space, and enables the agent to adapt to specific environment characteristics. Meanwhile, the deployment of RL models on real robot systems is difficult. Training from scratch may be infeasible due to slow convergence times, while transferring from simulation to reality, i.e. sim-to-real transfer, is a key challenge in itself. We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles to enable environment randomization and automated episode resetting. We investigate what level of fine-tuning is needed for adapting to a realistic setting. Through extensive experiments, we show that our approach surpasses the performance of both previous RL-based approaches and highly specialized methods across multiple CPP variations in simulation. Meanwhile, our method successfully transfers to a real robot. Our code implementation can be found online.
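The abstract names a reward term based on total variation but does not give its formula here. As a rough illustration only, the sketch below assumes a binary coverage grid (1 = covered, 0 = not yet covered) and uses the discrete anisotropic total variation, i.e. the sum of absolute differences between neighbouring cells. The names `total_variation` and `tv_reward`, the `scale` parameter, and the grid encoding are hypothetical and are not taken from the paper's code.

```python
from typing import List

Grid = List[List[int]]  # assumed encoding: 1 = covered cell, 0 = not yet covered


def total_variation(grid: Grid) -> int:
    """Discrete anisotropic total variation of a 2D grid.

    Sums the absolute differences between horizontally and vertically
    adjacent cells. For a binary coverage map this counts the edges
    separating covered from uncovered cells.
    """
    rows, cols = len(grid), len(grid[0])
    tv = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # right neighbour
                tv += abs(grid[r][c + 1] - grid[r][c])
            if r + 1 < rows:  # lower neighbour
                tv += abs(grid[r + 1][c] - grid[r][c])
    return tv


def tv_reward(before: Grid, after: Grid, scale: float = 1.0) -> float:
    """Hypothetical reward: positive when a step reduces total variation."""
    return scale * (total_variation(before) - total_variation(after))


# Two maps with the same number of covered cells (four each).
solid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
patchy = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

print(total_variation(solid))   # compact block: short boundary
print(total_variation(patchy))  # scattered cells: long boundary
```

Under these assumptions, a compact covered region has a lower total variation than a scattered one with the same covered area, and a fully covered map has a total variation of zero. Penalizing total variation would therefore discourage leaving small uncovered gaps, which is one plausible reading of how such a term promotes complete coverage.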
