AI · Jul 27, 2025

Concept Learning for Cooperative Multi-Agent Reinforcement Learning

arXiv:2507.20143v1 · h-index: 7 · 2025 IEEE 26th China Conference on System Simulation Technology and its Applications (CCSSTA)
Originality: Highly original
AI Analysis

This paper addresses trustworthiness and interpretability in cooperative multi-agent systems. For AI researchers and practitioners, it offers an approach that aims to move beyond the performance-interpretability trade-off.

The paper tackles the lack of transparency in multi-agent reinforcement learning by proposing an interpretable value decomposition framework built on concept bottleneck models. It reports superior performance over state-of-the-art methods on the StarCraft II and level-based foraging benchmarks.

Despite substantial progress in applying neural networks (NNs) to multi-agent reinforcement learning (MARL), these methods still largely suffer from a lack of transparency and interpretability: their implicit cooperative mechanisms are not yet fully understood because the underlying networks are black boxes. In this work, we study an interpretable value decomposition framework via concept bottleneck models, which promote trustworthiness by conditioning credit assignment on an intermediate level of human-like cooperation concepts. We propose a novel value-based method, named Concepts learning for Multi-agent Q-learning (CMQ), that goes beyond the current performance-vs-interpretability trade-off by learning interpretable cooperation concepts. CMQ represents each cooperation concept as a supervised vector, as opposed to existing models where the information flowing through their end-to-end mechanism is concept-agnostic. Intuitively, representing each concept by individual action values conditioned on global state embeddings allows for extra cooperation representation capacity. Empirical evaluations on the StarCraft II micromanagement challenge and level-based foraging (LBF) show that CMQ achieves superior performance compared with state-of-the-art counterparts. The results also demonstrate that CMQ provides richer cooperation concept representations that capture meaningful cooperation modes, and that it supports test-time concept interventions for detecting potential biases in cooperation modes and identifying spurious artifacts that impact cooperation.
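
The abstract's core idea, routing credit assignment through a small set of concept activations that can be overridden at test time, can be illustrated with a toy sketch. This is not the authors' CMQ architecture: the class name ConceptBottleneckMixer, the sigmoid gating, the linear layers, and the non-negative agent weights (a monotonicity constraint borrowed from QMIX-style mixing) are all assumptions made for illustration, and the weights are random rather than trained or supervised.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ConceptBottleneckMixer:
    """Hypothetical sketch: Q_tot depends on agents only through k concepts."""

    def __init__(self, n_agents, state_dim, n_concepts, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        # Assumed: each concept's activation is read off the global state.
        self.w_state = rand(n_concepts, state_dim)
        # Assumed: each concept holds non-negative per-agent credit weights,
        # so raising any agent's Q-value never lowers Q_tot.
        self.w_agent = [[abs(w) for w in row]
                        for row in rand(n_concepts, n_agents)]

    def concepts(self, state):
        # Concept activations in (0, 1), conditioned on the global state.
        return [sigmoid(z) for z in matvec(self.w_state, state)]

    def q_tot(self, agent_qs, state, interventions=None):
        c = self.concepts(state)
        # Test-time intervention: overwrite chosen concept activations.
        for idx, value in (interventions or {}).items():
            c[idx] = value
        # Per-concept credit: weighted sum of individual action values.
        per_concept = matvec(self.w_agent, agent_qs)
        return dot(c, per_concept), c


mixer = ConceptBottleneckMixer(n_agents=3, state_dim=4, n_concepts=2)
agent_qs = [1.0, 0.5, -0.2]          # individual action values (made up)
state = [0.1, -0.3, 0.7, 0.2]        # global state embedding (made up)

q_free, c_free = mixer.q_tot(agent_qs, state)
# Switch concept 0 off and observe how the joint value changes.
q_off, c_off = mixer.q_tot(agent_qs, state, interventions={0: 0.0})
print(q_free, q_off)
```

Because the joint value is a sum over concepts, the gap between q_free and q_off is exactly the contribution attributed to concept 0 in this toy model, which is the kind of question a test-time concept intervention is meant to answer.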

