RO · Apr 13

MARLIN: Multi-Agent Reinforcement Learning Guided by Language-Based Inter-Robot Negotiation

arXiv:2410.14383 · 58.63 citations · h-index: 10 · Has Code
Predicted impact: top 35% in RO · last 90 days
Originality: Incremental advance
AI Analysis

For multi-robot systems, MARLIN addresses the safety risks and inefficiency of early-stage reinforcement learning by using language-based negotiation between robots, offering a practical hybrid solution.

MARLIN introduces a hybrid framework combining large language models with multi-agent reinforcement learning, where LLMs guide high-level planning and negotiation during early training to improve safety and exploration. The approach achieves higher early training performance without sacrificing final performance in both simulated and physical robot experiments.

Multi-agent reinforcement learning is a key method for training multi-robot systems. Through rewarding or punishing robots over a series of episodes according to their performance, they can be trained and then deployed in the real world. However, poorly trained policies can lead to unsafe behaviour during early training stages. We introduce Multi-Agent Reinforcement Learning guided by language-based Inter-robot Negotiation (MARLIN), a hybrid framework in which large language models provide high-level planning before the reinforcement learning policy has learned effective behaviours. Robots use language models to negotiate actions and generate plans that guide policy learning. The system dynamically switches between reinforcement learning and language-model-based negotiation during training, enabling safer and more effective exploration. MARLIN is evaluated using both simulated and physical robots with local and remote language models. Results show that, compared to standard multi-agent reinforcement learning, the hybrid approach achieves higher performance in early training without reducing final performance. The code is available at https://github.com/SooratiLab/MARLIN.
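
The abstract does not spell out how the system decides when to switch between the reinforcement learning policy and language-model negotiation. As a rough illustration only, the sketch below assumes one plausible rule: use the language model until the policy's recent average episode score reaches an estimate of what negotiated plans achieve. The function name `choose_plan_source`, the parameters `llm_score_estimate` and `min_history`, and the rule itself are hypothetical and are not taken from the paper or its repository.

```python
from collections import deque
from statistics import mean


def choose_plan_source(recent_rl_scores, llm_score_estimate, min_history=5):
    """Pick who plans the next training episode: "llm" or "rl".

    Assumed rule (hypothetical, not the paper's): fall back to
    language-model negotiation until the RL policy has enough history
    and its recent average score matches the LLM's estimated score.
    """
    if len(recent_rl_scores) < min_history:
        return "llm"  # too little evidence about the policy yet
    if mean(recent_rl_scores) >= llm_score_estimate:
        return "rl"  # policy has caught up with the negotiated plans
    return "llm"


# Toy run: made-up episode scores that improve as training progresses.
history = deque(maxlen=5)  # sliding window of recent policy scores
sources = []
for score in [0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.9]:
    sources.append(choose_plan_source(history, llm_score_estimate=0.5))
    history.append(score)

print(sources)
```

In this toy run the language model plans the early episodes and the policy takes over once its windowed average passes the assumed estimate, which mirrors the behaviour the abstract describes: language-guided exploration early, standard reinforcement learning later.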

