Hilbert-Augmented Reinforcement Learning for Scalable Multi-Robot Coverage and Exploration
This work addresses scalability and efficiency challenges in multi-robot coverage for swarm and legged robotics. Its contribution is an incremental improvement obtained by integrating geometric priors into learning.
The paper tackles scalable multi-robot coverage and exploration by integrating Hilbert space-filling priors into reinforcement learning methods such as DQN and PPO. Relative to the DQN/PPO baselines, it reports improved coverage efficiency, reduced redundancy, and faster convergence. Validation on a Boston Dynamics Spot robot shows reliable performance.
We present a coverage framework that integrates Hilbert space-filling priors into decentralized multi-robot learning and execution. We augment DQN and PPO with Hilbert-based spatial indices to structure exploration and reduce redundancy in sparse-reward environments, and we evaluate scalability in multi-robot grid coverage. We further describe a waypoint interface that converts Hilbert orderings into curvature-bounded, time-parameterized SE(2) trajectories, i.e., planar poses (x, y, θ), making the trajectories feasible to execute onboard resource-constrained robots. Experiments show improvements in coverage efficiency, redundancy, and convergence speed over DQN/PPO baselines. In addition, we validate the approach on a Boston Dynamics Spot legged robot, executing the generated trajectories in indoor environments and observing reliable coverage with low redundancy. These results indicate that geometric priors improve autonomy and scalability for swarm and legged robotics.
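To make the idea of a Hilbert-based spatial index concrete, the following is a minimal sketch of the standard conversion between a Hilbert curve index and grid coordinates. It is not the implementation used in this work. The function names (`d2xy`, `xy2d`, `split_among_robots`), the power-of-two grid side `n`, and the contiguous-segment assignment of cells to robots are all assumptions made for illustration.

```python
def _rot(n, x, y, rx, ry):
    """Rotate/flip a quadrant so the curve keeps its orientation."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def d2xy(n, d):
    """Map Hilbert index d to cell (x, y) on an n x n grid (n a power of two)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rot(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def xy2d(n, x, y):
    """Map cell (x, y) on an n x n grid to its Hilbert index (inverse of d2xy)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rot(n, x, y, rx, ry)
        s //= 2
    return d


def split_among_robots(order, num_robots):
    """Hypothetical assignment: give each robot one contiguous curve segment."""
    base, extra = divmod(len(order), num_robots)
    segments, start = [], 0
    for i in range(num_robots):
        end = start + base + (1 if i < extra else 0)
        segments.append(order[start:end])
        start = end
    return segments


if __name__ == "__main__":
    n = 8
    order = [d2xy(n, d) for d in range(n * n)]  # full coverage ordering
    for robot_id, cells in enumerate(split_among_robots(order, 3)):
        print(robot_id, len(cells), cells[0], cells[-1])
```

Because consecutive Hilbert indices always map to adjacent cells, each contiguous segment is a connected path with no overlap between robots. That property is one plausible reason such an ordering can act as a prior that reduces redundant visits. In a learning setting, `xy2d` could also provide an extra scalar observation feature per cell, which is again an assumption rather than a description of the paper's state encoding.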