LG · Feb 28, 2025

Digital Player: Evaluating Large Language Models based Human-like Agent in Games

Jiawei Wang, Kai Wang, Shaojie Lin, Runze Wu, Bihan Xu, Lingeng Jiang, Shiwei Zhao, Renyu Zhu, Haoyu Liu, Zhipeng Hu, Zhong Fan, Le Li

arXiv:2502.20807v1 · 4 citations · h-index: 10 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work provides a tool for researchers to study human-like AI agents in gaming environments, but it is incremental as it focuses on creating a testbed rather than advancing agent performance.

The paper tackles the challenge of evaluating LLM-based agents as human-like 'digital players' in complex strategy games like Unciv, by developing an open-source testbed to study their capabilities in decision-making and social interactions.

With the rapid advancement of Large Language Models (LLMs), LLM-based autonomous agents have shown the potential to function as digital employees, such as digital analysts, teachers, and programmers. In this paper, we develop an application-level testbed based on the open-source strategy game "Unciv", which has millions of active players, to enable researchers to build a "data flywheel" for studying human-like agents in the "digital players" task. This "Civilization"-like game features expansive decision-making spaces along with rich linguistic interactions such as diplomatic negotiations and acts of deception, posing significant challenges for LLM-based agents in terms of numerical reasoning and long-term planning. Another challenge for "digital players" is to generate human-like responses for social interaction, collaboration, and negotiation with human players. The open-source project can be found at https://github.com/fuxiAIlab/CivAgent.
