Skill Check: Some Considerations on the Evaluation of Gamemastering Models for Role-playing Games
This work addresses the challenge of evaluating AI models as gamemasters for role-playing games. The contribution is incremental: it builds on existing interactive storytelling and NLP methods.
The paper proposes three test categories for assessing gamemastering models and applies them to ChatGPT, Bard, and OpenAssistant. The models performed variably across these categories; no specific numerical results are given.
In role-playing games, a Game Master (GM) is the player in charge of the game, who must design the challenges the players face and narrate the outcomes of their actions. In this work we discuss some challenges of modeling GMs from an Interactive Storytelling and Natural Language Processing perspective. Based on those challenges, we propose three test categories to evaluate such dialogue systems, and we use them to test ChatGPT, Bard, and OpenAssistant as out-of-the-box GMs.