LG · Feb 26

Multi-agent imitation learning with function approximation: Linear Markov games and beyond

arXiv:2602.22810v1 · 1 citation · h-index: 60
Originality: Highly original
AI Analysis

This work advances both the theory and the practical efficiency of multi-agent imitation learning (MAIL), and offers researchers and practitioners working with multi-agent systems a computationally efficient interactive algorithm.

This paper provides the first theoretical analysis of multi-agent imitation learning (MAIL) in linear Markov games, showing that a feature-level concentrability coefficient can replace the potentially much larger state-action-level coefficient. It also proposes the first computationally efficient interactive MAIL algorithm for linear Markov games, with sample complexity that depends only on the feature dimension $d$, and a deep interactive variant that outperforms behavioral cloning (BC) on games such as Tic-Tac-Toe and Connect4.
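
For readers unfamiliar with the BC baseline, the following is a minimal tabular sketch of behavioral cloning for a single agent, assuming discrete states and actions. It is not the paper's implementation; the function name `behavioral_cloning` and the toy demonstrations are hypothetical.

```python
from collections import Counter, defaultdict


def behavioral_cloning(demos):
    """Estimate pi(a | s) as the empirical action frequency in each observed state."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    return {
        state: {a: n / sum(c.values()) for a, n in c.items()}
        for state, c in counts.items()
    }


# Hypothetical expert demonstrations: (state, action) pairs for one agent.
demos = [("s0", "left"), ("s0", "left"), ("s0", "right"), ("s1", "right")]
policy = behavioral_cloning(demos)

# States the expert never visited have no entry at all in the learned policy.
unseen_state_covered = "s2" in policy
```

The estimate says nothing about states that are absent from the demonstrations. In a multi-agent game such states can be reached when another agent deviates, which is one intuition for why coverage conditions such as concentrability coefficients appear in offline analyses.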

In this work, we present the first theoretical analysis of multi-agent imitation learning (MAIL) in linear Markov games, where both the transition dynamics and each agent's reward function are linear in some given features. We demonstrate that, by leveraging this structure, it is possible to replace the state-action level "all policy deviation concentrability coefficient" (Freihaut et al., arXiv:2510.09325) with a concentrability coefficient defined at the feature level, which can be much smaller than its state-action analog when the features are informative about states' similarity. Furthermore, to circumvent the need for any concentrability coefficient, we turn to the interactive setting. We provide the first computationally efficient interactive MAIL algorithm for linear Markov games and show that its sample complexity depends only on the dimension $d$ of the feature map. Building on these theoretical findings, we propose a deep interactive MAIL algorithm that clearly outperforms BC on games such as Tic-Tac-Toe and Connect4.
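
The linear structure described above can be illustrated with a toy example. The sketch below assumes the standard linear parameterization, $P(s' \mid s, a) = \langle \phi(s, a), \mu(s') \rangle$ and $r_i(s, a) = \langle \phi(s, a), \theta_i \rangle$, where $a$ is the joint action; the paper's exact definition may differ, and every number and name here (`phi`, `mu`, `theta`) is invented for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


# Two states, two joint actions, feature dimension d = 2.
# phi maps each (state, joint_action) pair to a d-dimensional feature vector.
phi = {
    (0, 0): [1.0, 0.0],
    (0, 1): [0.5, 0.5],
    (1, 0): [0.5, 0.5],
    (1, 1): [0.0, 1.0],
}

# mu[s_next] is a d-dimensional vector. Each coordinate, read across next
# states, sums to one, so convex feature weights give valid probabilities.
mu = {0: [0.9, 0.2], 1: [0.1, 0.8]}

# One reward parameter vector per agent (a zero-sum pair in this toy case).
theta = {"agent_1": [1.0, 0.0], "agent_2": [-1.0, 0.0]}


def transition(state, joint_action):
    """Next-state distribution, linear in the features."""
    features = phi[(state, joint_action)]
    return {s_next: dot(features, mu[s_next]) for s_next in mu}


def reward(agent, state, joint_action):
    """Reward of one agent, linear in the same features."""
    return dot(phi[(state, joint_action)], theta[agent])


# Every (state, joint_action) pair must induce a valid distribution.
for state, joint_action in phi:
    probs = transition(state, joint_action)
    assert abs(sum(probs.values()) - 1.0) < 1e-9
    assert all(p >= 0.0 for p in probs.values())
```

The pairs `(0, 1)` and `(1, 0)` share a feature vector, so they have identical transitions and rewards. Data collected at one is therefore informative about the other, which gives some intuition for why a coverage measure defined over features can be smaller than one defined over state-action pairs.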

