AIFeb 16, 2025Code
Hierarchical Expert Prompt for Large-Language-Model: An Approach Defeat Elite AI in TextStarCraft II for the First TimeZongyuan Li, Chang Lu, Xiaojie Xu et al.
Since the emergence of the Large Language Model (LLM), LLM has been widely used in fields such as writing, translating, and searching. However, there is still great potential for LLM-based methods in handling complex tasks such as decision-making in the StarCraft II environment. To address problems such as lack of relevant knowledge and poor control over subtasks of varying importance, we propose a Hierarchical Expert Prompt (HEP) for LLM. Our method improves the understanding of game situations through expert-level tactical knowledge, improving the processing quality of tasks of varying importance through a hierarchical framework. Our approach defeated the highest level (Elite) standard built-in agent in TextStarCraft II for the first time and consistently outperformed the baseline method in other difficulties. Our experiments suggest that the proposed method is a practical solution for tackling complex decision-making challenges. The replay video can be viewed on https://www.bilibili.com/video/BV1uz42187EF and https://youtu.be/dO3PshWLV5M, and our codes have been open-sourced on https://github.com/luchang1113/HEP-LLM-play-StarCraftII.
AINov 8, 2024
LLM-PySC2: Starcraft II learning environment for Large Language ModelsZongyuan Li, Yanan Ni, Runnan Qi et al.
The tremendous potential has been demonstrated by large language models (LLMs) in intelligent decision-making problems, with unprecedented capabilities shown across diverse applications ranging from gaming AI systems to complex strategic planning frameworks. However, the StarCraft II platform, which has been widely adopted for validating decision-making algorithms in the past decade, has not yet provided substantial support for this emerging domain. To address issues that LLMs cannot interface with the hundreds of actions of the pysc2 backend and the lack of native support for multi-agent (MA) collaboration, we propose the LLM-PySC2 environment. This is the first environment that offers LLMs the complete pysc2 action space with sufficient multi-modal information and game Wiki knowledge. With an asynchronous query architecture, the environment efficiently interacts with LLMs that maintain a constant latency regardless of the scale of the agents' population. In the experiments, we evaluated LLMs' decision-making performance in both the macro-decision and micro-operation scenarios, with traditional StarCraft II Multi-Agent Challenge (SMAC) tasks and a series of new proposed. Results indicate that LLMs possess the potential to achieve victories in complex scenarios but cannot constantly generate correct decisions, especially in the recovered pysc2 action space and MA settings. Without task-relevant instructions, the pre-trained models suffer from issues such as hallucinations and inefficient collaboration. Our findings suggest that StarCraft II still challenges in the era of large models, revealing that there is a lot to do to develop an advanced LLM decision-making system, and the proposed LLM-PySC2 environment will support future development of LLM-based decision-making solutions.
AIFeb 19, 2025
Reflection of Episodes: Learning to Play Game from Expert and Self ExperiencesXiaojie Xu, Zongyuan Li, Chang Lu et al.
StarCraft II is a complex and dynamic real-time strategy (RTS) game environment, which is very suitable for artificial intelligence and reinforcement learning research. To address the problem of Large Language Model(LLM) learning in complex environments through self-reflection, we propose a Reflection of Episodes(ROE) framework based on expert experience and self-experience. This framework first obtains key information in the game through a keyframe selection method, then makes decisions based on expert experience and self-experience. After a game is completed, it reflects on the previous experience to obtain new self-experience. Finally, in the experiment, our method beat the robot under the Very Hard difficulty in TextStarCraft II. We analyze the data of the LLM in the process of the game in detail, verified its effectiveness.