AI MAMay 30

Stop Wandering, Find the Keys: LLMs Discriminate Key States for Efficient Multi-Agent Exploration

Yun Qu, Boyuan Wang, Yuhang Jiang, Jianzhun Shao, Yixiu Mao, Heming Zou, Chang Liu, Cheems Wang, Meiqin Liu, Xiangyang Ji

Tsinghua

arXiv:2410.0251180.113 citationsh-index: 11

AI Analysis

For multi-agent RL practitioners, LEMAE reduces redundant exploration by leveraging LLM guidance, offering significant efficiency gains.

LEMAE uses LLMs to identify key states for multi-agent exploration, achieving a 10x acceleration and outperforming SOTA on SMAC and MPE benchmarks.

With expansive state-action spaces, efficient multi-agent exploration remains a longstanding challenge in reinforcement learning. Although pursuing novelty, diversity, or uncertainty attracts increasing attention, redundant efforts brought by exploration without proper guidance choices poses a practical issue for the community. This paper introduces a systematic approach, termed LEMAE, choosing to channel informative task-relevant guidance from a knowledgeable Large Language Model (LLM) for Efficient Multi-Agent Exploration. Specifically, we ground linguistic knowledge from LLM into symbolic key states, that are critical for task fulfillment, in a discriminative manner at low LLM inference costs. To unleash the power of key states, we design Subspace-based Hindsight Intrinsic Reward (SHIR) to guide agents toward key states by increasing reward density. Additionally, we build the Key State Memory Tree (KSMT) to track transitions between key states in a specific task for organized exploration. Benefiting from diminishing redundant explorations, LEMAE outperforms existing SOTA approaches on the challenging benchmarks (e.g., SMAC and MPE) by a large margin, achieving a 10x acceleration in certain scenarios.

View on arXiv PDF

Similar