CL · Sep 1, 2025

Can Large Language Models Master Complex Card Games?

Wei Wang, Fuqing Bie, Junzhe Chen, Dan Zhang, Shiyu Huang, Evgeny Kharlamov, Jie Tang

arXiv:2509.01328v56.72 citations · h-index: 13 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of applying LLMs to complex game benchmarks, showing incremental progress in adapting them to specific domains like card games.

The paper asks whether large language models (LLMs) can master complex card games. It reports three findings: LLMs can approach the performance of strong game AIs through supervised fine-tuning on high-quality gameplay data; a single model can become proficient in multiple games at once, with gains for games that share similar rules and conflicts for dissimilar ones; and mastering these games reduces general capabilities, a decline that can be mitigated by mixing general instruction data into training.

Complex games have long been an important benchmark for testing the progress of artificial intelligence algorithms. AlphaGo, AlphaZero, and MuZero have defeated top human players in Go and Chess, garnering widespread societal attention towards artificial intelligence. Concurrently, large language models (LLMs) have exhibited remarkable capabilities across various tasks, raising the question of whether LLMs can achieve similar success in complex games. In this paper, we explore the potential of LLMs in mastering complex card games. We systematically assess the learning capabilities of LLMs across eight diverse card games, evaluating the impact of fine-tuning on high-quality gameplay data, and examining the models' ability to retain general capabilities while mastering these games. Our findings indicate that: (1) LLMs can approach the performance of strong game AIs through supervised fine-tuning on high-quality data, (2) LLMs can achieve a certain level of proficiency in multiple complex card games simultaneously, with performance augmentation for games with similar rules and conflicts for dissimilar ones, and (3) LLMs experience a decline in general capabilities when mastering complex games, but this decline can be mitigated by integrating a certain amount of general instruction data. The evaluation results demonstrate strong learning ability and versatility of LLMs. The code is available at https://github.com/THUDM/LLM4CardGame
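The third finding, mixing general instruction data into the game fine-tuning set, can be illustrated with a minimal sketch. The abstract does not give the data format or the mixing ratio, so the function name `mix_training_data`, the `general_fraction` parameter, and the prompt/response dictionaries below are hypothetical and are not taken from the THUDM/LLM4CardGame code.

```python
import random


def mix_training_data(game_examples, general_examples, general_fraction, seed=0):
    """Return a shuffled list in which about `general_fraction` of the
    examples come from the general instruction set (hypothetical helper)."""
    if not 0.0 <= general_fraction < 1.0:
        raise ValueError("general_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_general / (n_game + n_general) = general_fraction for n_general.
    n_general = round(len(game_examples) * general_fraction / (1.0 - general_fraction))
    # Never request more general examples than are available.
    n_general = min(n_general, len(general_examples))
    mixed = list(game_examples) + rng.sample(list(general_examples), n_general)
    rng.shuffle(mixed)
    return mixed


# Toy stand-ins for gameplay records and general instruction records.
game = [{"prompt": f"game state {i}", "response": f"action {i}"} for i in range(80)]
general = [{"prompt": f"question {i}", "response": f"answer {i}"} for i in range(100)]

mixed = mix_training_data(game, general, general_fraction=0.2)
print(len(mixed))
```

With 80 game examples and a 0.2 general share, the sketch draws 20 general examples, so the mixed set has 100 entries. The right share in practice is an empirical question that the abstract leaves as "a certain amount".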
