Towards automating Codenames spymasters with deep reinforcement learning
This work addresses a gap in RL research for cooperative multiplayer and text-based games, though it is incremental as it builds on existing methods without achieving convergence.
The authors apply reinforcement learning to the cooperative, text-based board game Codenames by formulating it as a Markov Decision Process and testing SAC, PPO, and A2C. None of the algorithms converged on Codenames, and on the simplified ClickPixel environment they converged only for small board sizes.
Most reinforcement learning research has centered on competitive games, and little work has addressed cooperative multiplayer games or text-based games. Codenames is a board game that involves both asymmetric cooperation and natural language processing, which makes it an excellent candidate for advancing RL research. To my knowledge, this work is the first to formulate Codenames as a Markov Decision Process and to apply well-known reinforcement learning algorithms such as SAC, PPO, and A2C to the environment. None of these algorithms converges on the Codenames environment, nor do they converge on a simplified environment called ClickPixel, except when the board size is small.
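The text above does not describe the rules of ClickPixel, so the sketch below is only a hypothetical stand-in, not the author's environment. It assumes a one-step task in which exactly one cell of an n x n board is the target and the agent is rewarded for clicking it. The names `ToyClickEnv` and `random_policy_success` are invented for illustration. The sketch shows what a minimal MDP interface (state, action, reward) can look like for such a board task.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyClickEnv:
    """Hypothetical one-step MDP: one cell of a size x size board is the target.

    This is an assumed toy task, not the ClickPixel environment from the paper.
    """
    size: int
    rng: random.Random = field(default_factory=random.Random)
    target: int = 0

    def reset(self) -> List[int]:
        # State: a flattened one-hot board marking the target cell.
        n_cells = self.size * self.size
        self.target = self.rng.randrange(n_cells)
        return [1 if i == self.target else 0 for i in range(n_cells)]

    def step(self, action: int) -> Tuple[float, bool]:
        # Reward 1.0 for clicking the target cell, 0.0 otherwise.
        # The episode always ends after a single click.
        reward = 1.0 if action == self.target else 0.0
        return reward, True


def random_policy_success(size: int, episodes: int, seed: int = 0) -> float:
    """Estimate how often a uniformly random click hits the target."""
    rng = random.Random(seed)
    env = ToyClickEnv(size=size, rng=rng)
    wins = 0.0
    for _ in range(episodes):
        obs = env.reset()
        action = rng.randrange(len(obs))  # ignore the observation, click at random
        reward, _done = env.step(action)
        wins += reward
    return wins / episodes
```

In this toy task the action space has n² entries, so a uniformly random click succeeds with probability 1/n². Rewards therefore become sparser as the board grows, which is one plausible reason, under these assumptions, why learning could be harder on larger boards.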