CL, AI · Jan 13

STAGE: A Benchmark for Knowledge Graph Construction, Question Answering, and In-Script Role-Playing over Movie Screenplays

arXiv:2601.08510v1 · 1 citation
Originality: Incremental advance
AI Analysis

STAGE provides a holistic evaluation framework for narrative understanding in AI, addressing a gap left by prior benchmarks that focused on individual subtasks.

The authors address the problem of evaluating whether models can construct a coherent story world from a movie screenplay and use it consistently across multiple reasoning and generation tasks. They introduce STAGE, a unified benchmark with four tasks (knowledge graph construction, scene-level event summarization, long-context question answering, and in-script role-playing), all grounded in a shared narrative representation and covering 150 films in English and Chinese.

Movie screenplays are rich long-form narratives that interleave complex character relationships, temporally ordered events, and dialogue-driven interactions. While prior benchmarks target individual subtasks such as question answering or dialogue generation, they rarely evaluate whether models can construct a coherent story world and use it consistently across multiple forms of reasoning and generation. We introduce STAGE (Screenplay Text, Agents, Graphs and Evaluation), a unified benchmark for narrative understanding over full-length movie screenplays. STAGE defines four tasks: knowledge graph construction, scene-level event summarization, long-context screenplay question answering, and in-script character role-playing, all grounded in a shared narrative world representation. The benchmark provides cleaned scripts, curated knowledge graphs, and event- and character-centric annotations for 150 films across English and Chinese, enabling holistic evaluation of models' abilities to build world representations, abstract and verify narrative events, reason over long narratives, and generate character-consistent responses.
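To make the idea of a shared narrative world representation concrete, the following is a minimal sketch of a story graph that stores scene-tagged facts and can be queried by entity. It is an illustration only: the class names (`Triple`, `StoryGraph`), the relation labels, and the example characters are all hypothetical and are not taken from STAGE's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One (subject, relation, object) fact, tagged with the scene it came from."""
    subject: str
    relation: str
    obj: str
    scene: int


class StoryGraph:
    """Toy story-world graph (hypothetical): stores triples, indexed by entity."""

    def __init__(self):
        self.triples = []
        self._by_entity = defaultdict(list)

    def add(self, subject, relation, obj, scene):
        triple = Triple(subject, relation, obj, scene)
        self.triples.append(triple)
        self._by_entity[subject].append(triple)
        self._by_entity[obj].append(triple)
        return triple

    def facts_about(self, entity, up_to_scene=None):
        """Return triples mentioning `entity`, optionally only up to a given scene."""
        facts = self._by_entity.get(entity, [])
        if up_to_scene is not None:
            facts = [t for t in facts if t.scene <= up_to_scene]
        return sorted(facts, key=lambda t: t.scene)


# Invented example facts, not drawn from any real screenplay.
graph = StoryGraph()
graph.add("Alice", "sibling_of", "Bob", scene=1)
graph.add("Alice", "betrays", "Carol", scene=7)
graph.add("Bob", "located_in", "Harbor", scene=3)

# Facts a role-playing or QA component could rely on at scene 5.
early_facts = graph.facts_about("Alice", up_to_scene=5)
```

Tagging each fact with its scene is one plausible way to let a single structure serve several tasks: a question-answering component can query the whole graph, while an in-script role-playing component can restrict itself to what is established up to the current scene.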

