Dungeon Crawl Stone Soup as an Evaluation Domain for Artificial Intelligence
The paper introduces a new evaluation domain for AI researchers, but the contribution is incremental because it builds on existing game APIs such as MALMO and the Starcraft II API.
The paper proposes using Dungeon Crawl Stone Soup, a complex rogue-like video game, as a testbed for evaluating AI and cognitive systems, highlighting its properties and an ongoing API development effort.
Dungeon Crawl Stone Soup is a popular, single-player, free and open-source rogue-like video game whose decision space is complex enough to make it an ideal testbed for research in cognitive systems and, more generally, artificial intelligence. This paper describes the properties of Dungeon Crawl Stone Soup that are conducive to evaluating new approaches to AI systems. We also highlight an ongoing effort to build an API for AI researchers in the spirit of recent game APIs such as MALMO, ELF, and the Starcraft II API. Dungeon Crawl Stone Soup's complexity offers significant opportunities for evaluating AI and cognitive systems, including through human user studies. In this paper we provide (1) a description of the state space of Dungeon Crawl Stone Soup, (2) a description of the components of our API, and (3) the potential benefits of evaluating AI agents in the Dungeon Crawl Stone Soup video game.