Plot and Rework: Modeling Storylines for Visual Storytelling
This work addresses automated visual storytelling, with applications such as creative writing and content generation. Its contribution is incremental: it extends existing methods with external knowledge and iterative processes.
The paper tackles the problem of generating coherent and engaging stories from image sequences. It introduces PR-VIST, a framework that models storylines with a story graph and iterative training. Per the paper's automatic and human evaluations, the resulting stories are superior in diversity, coherence, and humanness.
Writing a coherent and engaging story is not easy. Creative writers use their knowledge and worldview to put disjointed elements together to form a coherent storyline, and work and rework iteratively toward perfection. Automated visual storytelling (VIST) models, however, make poor use of external knowledge and iterative generation when attempting to create stories. This paper introduces PR-VIST, a framework that represents the input image sequence as a story graph in which it finds the best path to form a storyline. PR-VIST then takes this path and learns to generate the final story via an iterative training process. This framework produces stories that are superior in terms of diversity, coherence, and humanness, per both automatic and human evaluations. An ablation study shows that both plotting and reworking contribute to the model's superiority.
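To make the "best path through a story graph" idea concrete, here is a minimal sketch in Python. It is not PR-VIST's actual algorithm: the layered graph layout (one set of candidate terms per image), the example terms, the edge scores, and the function name `best_storyline` are all assumptions made for illustration. The abstract does not specify how the graph is built or how paths are scored, so additive edge scores with dynamic programming are used here as a stand-in.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: one list of candidate terms per image (not from the paper).
LAYERS: List[List[str]] = [
    ["family", "beach"],
    ["picnic", "swim"],
    ["sunset", "home"],
]

# Hypothetical edge scores, standing in for whatever relatedness
# score a real system would derive from external knowledge.
EDGE_SCORES: Dict[Tuple[str, str], float] = {
    ("family", "picnic"): 0.9,
    ("family", "swim"): 0.4,
    ("beach", "picnic"): 0.3,
    ("beach", "swim"): 0.8,
    ("picnic", "sunset"): 0.5,
    ("picnic", "home"): 0.7,
    ("swim", "sunset"): 0.4,
    ("swim", "home"): 0.2,
}


def best_storyline(
    layers: List[List[str]],
    edge_scores: Dict[Tuple[str, str], float],
) -> Tuple[float, List[str]]:
    """Return (total score, path) of the highest-scoring path that
    picks one term per layer, using dynamic programming over layers."""
    # best[node] = (score of the best path ending at node, that path)
    best = {node: (0.0, [node]) for node in layers[0]}
    for layer in layers[1:]:
        nxt = {}
        for node in layer:
            # Extend every reachable previous path by the edge into this node.
            candidates = [
                (score + edge_scores[(prev, node)], path + [node])
                for prev, (score, path) in best.items()
                if (prev, node) in edge_scores
            ]
            if candidates:
                nxt[node] = max(candidates, key=lambda c: c[0])
        best = nxt
    # Pick the best path over all terms in the final layer.
    return max(best.values(), key=lambda c: c[0])


score, path = best_storyline(LAYERS, EDGE_SCORES)
print(path)
```

In this toy setup the selected path plays the role of the storyline that a downstream generator would turn into text. The iterative "rework" stage described in the abstract is not modeled here.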