Learning Strategy Representation for Imitation Learning in Multi-Agent Games
This work addresses the challenge of keeping multi-agent imitation learning from acquiring undesirable behaviors, though it appears incremental because it is a plug-in method for existing algorithms.
The paper tackles imitation learning in multi-agent games where offline datasets contain diverse strategies. It introduces the STRIL framework, which learns strategy representations and uses them to filter out sub-optimal data, and it reports significant performance improvements for existing IL algorithms in competitive environments including Two-player Pong, Limit Texas Hold'em, and Connect Four.
Offline datasets for imitation learning (IL) in multi-agent games typically contain player trajectories exhibiting diverse strategies, so measures are needed to prevent learning algorithms from acquiring undesirable behaviors. Learning representations for these trajectories is an effective way to characterize the strategy employed by each demonstrator. However, existing representation learning methods often require player identification or rely on strong assumptions, neither of which is appropriate for multi-agent games. Therefore, in this paper, we introduce the Strategy Representation for Imitation Learning (STRIL) framework, which (1) effectively learns strategy representations in multi-agent games, (2) estimates proposed indicators based on these representations, and (3) filters out sub-optimal data using the indicators. STRIL is a plug-in method that can be integrated into existing IL algorithms. We demonstrate the effectiveness of STRIL across competitive multi-agent scenarios, including Two-player Pong, Limit Texas Hold'em, and Connect Four. Our approach successfully acquires strategy representations and indicators, thereby identifying dominant trajectories and significantly enhancing the performance of existing IL algorithms across these environments.
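The filtering stage, step (3) of the framework, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that steps (1) and (2) have already produced one scalar indicator score per trajectory, and that a lower score marks a more desirable trajectory. The function name `filter_by_indicator`, the parameter `keep_fraction`, and the score values are all hypothetical and chosen only for illustration.

```python
import math


def filter_by_indicator(trajectories, scores, keep_fraction=0.5):
    """Keep the trajectories with the lowest indicator scores.

    Assumption (not from the paper): a lower score means a better
    trajectory, and a fixed fraction of the dataset is retained.
    """
    if len(trajectories) != len(scores):
        raise ValueError("need exactly one score per trajectory")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    # Number of trajectories to retain; always keep at least one.
    n_keep = max(1, math.ceil(keep_fraction * len(trajectories)))

    # Indices sorted by ascending score (sorted() is stable, so ties
    # keep their original dataset order).
    order = sorted(range(len(trajectories)), key=lambda i: scores[i])

    # Restore dataset order among the retained trajectories.
    kept = sorted(order[:n_keep])
    return [trajectories[i] for i in kept]


# Toy usage: four placeholder trajectories with made-up scores.
demo = filter_by_indicator(["a", "b", "c", "d"], [0.9, 0.1, 0.5, 0.2], 0.5)
print(demo)  # ['b', 'd']
```

The filtered list could then be passed unchanged to any existing IL algorithm, which is what makes a filter of this kind usable as a plug-in. A fixed retention fraction is only one possible selection rule. A score threshold would serve equally well in this sketch.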