CL | AI | Sep 20, 2024

Generating Visual Stories with Grounded and Coreferent Characters

arXiv:2409.13555v2 | 2 citations | h-index: 86
Originality: Incremental advance
AI Analysis

This paper addresses vague or incorrect character references in visual storytelling, with applications such as automated narrative generation. The advance is incremental because it builds on existing benchmarks.

The paper tackles the problem of generic visual storytelling by introducing character-centric story generation. The resulting model produces stories with more consistent and coreferent character mentions than baselines and state-of-the-art systems.

Characters are important in narratives. They move the plot forward, create emotional connections, and embody the story's themes. Visual storytelling methods focus more on the plot and events relating to it, without building the narrative around specific characters. As a result, the generated stories feel generic, with character mentions being absent, vague, or incorrect. To mitigate these issues, we introduce the new task of character-centric story generation and present the first model capable of predicting visual stories with consistently grounded and coreferent character mentions. Our model is finetuned on a new dataset which we build on top of the widely used VIST benchmark. Specifically, we develop an automated pipeline to enrich VIST with visual and textual character coreference chains. We also propose new evaluation metrics to measure the richness of characters and coreference in stories. Experimental results show that our model generates stories with recurring characters which are consistent and coreferent to a larger extent compared to baselines and state-of-the-art systems.

