RO, AI, MA | Oct 20, 2025

R2BC: Multi-Agent Imitation Learning from Single-Agent Demonstrations

arXiv:2510.18085v1 | 1 citation | h-index: 20
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently teaching collaborative multi-agent systems from limited human demonstrations. It is an incremental advance, building on existing imitation learning methods.

The paper tackles the problem of training multi-robot systems via imitation learning from single-agent demonstrations. It introduces Round-Robin Behavior Cloning (R2BC), in which a human teleoperates one agent at a time. Across four simulated multi-agent tasks, R2BC matches, and in some cases surpasses, an oracle behavior cloning baseline trained on privileged synchronized demonstrations. The method is also deployed on two physical robot tasks.

Imitation Learning (IL) is a natural way for humans to teach robots, particularly when high-quality demonstrations are easy to obtain. While IL has been widely applied to single-robot settings, relatively few studies have addressed the extension of these methods to multi-agent systems, especially in settings where a single human must provide demonstrations to a team of collaborating robots. In this paper, we introduce and study Round-Robin Behavior Cloning (R2BC), a method that enables a single human operator to effectively train multi-robot systems through sequential, single-agent demonstrations. Our approach allows the human to teleoperate one agent at a time and incrementally teach multi-agent behavior to the entire system, without requiring demonstrations in the joint multi-agent action space. We show that R2BC methods match, and in some cases surpass, the performance of an oracle behavior cloning approach trained on privileged synchronized demonstrations across four multi-agent simulated tasks. Finally, we deploy R2BC on two physical robot tasks trained using real human demonstrations.
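The round-robin idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions the abstract does not state: agents that are not being teleoperated act with their current cloned policies, behavior cloning is done with a simple lookup table, and the "human" is replaced by a scripted expert. The names `fit_bc`, `round_robin_bc`, and `make_toy_env`, and the toy line-walking task, are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Obs = Tuple[int, ...]            # hypothetical discrete per-agent observation
Action = int                     # hypothetical discrete action
Policy = Callable[[Obs], Action]


def fit_bc(dataset: List[Tuple[Obs, Action]], default: Action = 0) -> Policy:
    """Tabular behavior cloning: most frequent demonstrated action per observation."""
    counts: Dict[Obs, Dict[Action, int]] = {}
    for obs, act in dataset:
        per_obs = counts.setdefault(obs, {})
        per_obs[act] = per_obs.get(act, 0) + 1
    table = {obs: max(acts, key=acts.get) for obs, acts in counts.items()}
    # Unseen observations fall back to the default action (assumption).
    return lambda obs: table.get(obs, default)


def round_robin_bc(n_agents, n_rounds, horizon, reset, step, expert):
    """Collect single-agent demonstrations in round-robin order and clone each agent."""
    datasets: List[List[Tuple[Obs, Action]]] = [[] for _ in range(n_agents)]
    policies: List[Policy] = [fit_bc([]) for _ in range(n_agents)]
    for r in range(n_rounds):
        demo_agent = r % n_agents          # the one agent being "teleoperated"
        obs = reset()
        for _ in range(horizon):
            # Assumption: all other agents act with their current cloned policies.
            actions = [policies[i](obs[i]) for i in range(n_agents)]
            actions[demo_agent] = expert(demo_agent, obs[demo_agent])
            datasets[demo_agent].append((obs[demo_agent], actions[demo_agent]))
            obs = step(actions)
        # Refit only the agent that just received a demonstration.
        policies[demo_agent] = fit_bc(datasets[demo_agent])
    return policies, datasets


def make_toy_env(n_agents: int, start: int = 3):
    """Hypothetical toy task: each agent walks along a line toward position 0."""
    state = [0] * n_agents

    def reset():
        state[:] = [start + i for i in range(n_agents)]
        return [(p,) for p in state]

    def step(actions):
        for i, a in enumerate(actions):
            state[i] += a
        return [(p,) for p in state]

    def expert(agent, obs):
        # Scripted stand-in for the human operator.
        pos = obs[0]
        return -1 if pos > 0 else (1 if pos < 0 else 0)

    return reset, step, expert
```

Calling `round_robin_bc(2, 4, 6, *make_toy_env(2))` alternates demonstrations between the two agents and returns one cloned policy and one dataset per agent. No demonstration is ever given in the joint action space; each dataset holds only the observations and actions of the agent that was being controlled in that round.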

