CV | AI | HC | Mar 19

Do Vision Language Models Understand Human Engagement in Games?

arXiv:2603.18480 | 52.21 citations | h-index: 4
AI Analysis

This work addresses automated engagement inference for game design and player-experience research. Its contribution is incremental: it documents the limitations of existing VLMs rather than proposing a new method.

The study evaluated whether vision-language models (VLMs) can infer human engagement from gameplay video. Zero-shot predictions were generally weak and often failed to outperform simple per-game majority-class baselines. Retrieval-augmented prompting improved pointwise prediction in some settings, pairwise prediction stayed difficult under every strategy, and theory-guided prompting alone did not reliably help.

Inferring human engagement from gameplay video is important for game design and player-experience research, yet it remains unclear whether vision-language models (VLMs) can infer such latent psychological states from visual cues alone. Using the GameVibe Few-Shot dataset across nine first-person shooter games, we evaluate three VLMs under six prompting strategies, including zero-shot prediction, theory-guided prompts grounded in Flow, GameFlow, Self-Determination Theory, and MDA, and retrieval-augmented prompting. We consider both pointwise engagement prediction and pairwise prediction of engagement change between consecutive windows. Results show that zero-shot VLM predictions are generally weak and often fail to outperform simple per-game majority-class baselines. Memory- or retrieval-augmented prompting improves pointwise prediction in some settings, whereas pairwise prediction remains consistently difficult across strategies. Theory-guided prompting alone does not reliably help and can instead reinforce surface-level shortcuts. These findings suggest a perception-understanding gap in current VLMs: although they can recognize visible gameplay cues, they still struggle to robustly infer human engagement across games.
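To make the two evaluation settings concrete, here is a minimal Python sketch of a per-game majority-class baseline (the pointwise reference point) and of deriving change labels between consecutive windows (the pairwise target). Everything in it is an assumption made for illustration: the `(game, label)` schema, the label names, the `up`/`down`/`same` scheme, and both function names are hypothetical and are not taken from the paper or from the GameVibe Few-Shot dataset.

```python
from collections import Counter, defaultdict


def per_game_majority_baseline(train, test):
    """Accuracy of predicting each game's most frequent training label.

    train, test: lists of (game, label) pairs (hypothetical schema).
    """
    counts = defaultdict(Counter)
    for game, label in train:
        counts[game][label] += 1

    # Most frequent training label for each game.
    majority = {game: c.most_common(1)[0][0] for game, c in counts.items()}

    # A test window from a game unseen in training counts as an error.
    correct = sum(1 for game, label in test if majority.get(game) == label)
    return correct / len(test)


def pairwise_change_labels(scores):
    """Label the engagement change between each pair of consecutive windows.

    scores: per-window engagement values for one gameplay session
    (assumed numeric; the paper's actual annotation format may differ).
    """
    labels = []
    for prev, curr in zip(scores, scores[1:]):
        if curr > prev:
            labels.append("up")
        elif curr < prev:
            labels.append("down")
        else:
            labels.append("same")
    return labels


if __name__ == "__main__":
    train = [("g1", "high"), ("g1", "high"), ("g1", "low"),
             ("g2", "low"), ("g2", "low")]
    test = [("g1", "high"), ("g1", "low"), ("g2", "low"), ("g2", "high")]
    print(per_game_majority_baseline(train, test))
    print(pairwise_change_labels([0.2, 0.5, 0.5, 0.1]))
```

A baseline like this never looks at the video, so a model that fails to beat it on pointwise prediction is not demonstrably using visual evidence of engagement.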

