A Pipeline for Creative Visual Storytelling
This work addresses the need for more flexible, audience-adaptive storytelling in computational visual tasks. The contribution is incremental: it builds on existing cross-disciplinary approaches and does not demonstrate broad state-of-the-art gains.
The paper tackles the problem of generating varied, adaptive textual stories from image sequences. It proposes a pipeline of three task-modules (Object Identification, Single-Image Inferencing, and Multi-Image Narration) as a preliminary design for creative visual storytelling. Results include a piloted annotation task and an analysis of the collected corpus.
Computational visual storytelling produces a textual description of the events and interpretations depicted in a sequence of images. These texts are made possible by advances and cross-disciplinary approaches in natural language processing, generation, and computer vision. We define computational creative visual storytelling as the ability to alter the telling of a story along three aspects: to speak about different environments, to produce variations based on narrative goals, and to adapt the narrative to the audience. These aspects of creative storytelling and their effect on the narrative have yet to be explored in visual storytelling. This paper presents a pipeline of task-modules (Object Identification, Single-Image Inferencing, and Multi-Image Narration) that serves as a preliminary design for building a creative visual storyteller. We have piloted this design for a sequence of images in an annotation task. We present and analyze the collected corpus and describe plans towards automation.
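The three task-modules named in the abstract can be read as stages composed in order. The following is a minimal sketch of that composition, not the authors' implementation: the paper piloted the design as a human annotation task, and every name here (`ImageAnnotation`, `run_pipeline`, the stub stages) is a hypothetical chosen for illustration. The stubs only manipulate strings so that the data flow between stages is visible.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ImageAnnotation:
    """Per-image output of the first two stages (hypothetical structure)."""
    image_id: str
    objects: List[str]   # result of Object Identification
    inference: str       # result of Single-Image Inferencing


# Hypothetical stage signatures; the paper does not specify an API.
ObjectIdentifier = Callable[[str], List[str]]
SingleImageInferencer = Callable[[str, List[str]], str]
MultiImageNarrator = Callable[[List[ImageAnnotation]], str]


def run_pipeline(
    image_ids: List[str],
    identify: ObjectIdentifier,
    infer: SingleImageInferencer,
    narrate: MultiImageNarrator,
) -> str:
    """Run the three stages in order over a sequence of images."""
    annotations = []
    for image_id in image_ids:
        objects = identify(image_id)          # stage 1, per image
        inference = infer(image_id, objects)  # stage 2, per image
        annotations.append(ImageAnnotation(image_id, objects, inference))
    return narrate(annotations)               # stage 3, over the whole sequence


# Stub stages (assumptions for demonstration only).
def stub_identify(image_id: str) -> List[str]:
    return ["object-in-" + image_id]


def stub_infer(image_id: str, objects: List[str]) -> str:
    return f"{image_id} shows {', '.join(objects)}"


def stub_narrate(annotations: List[ImageAnnotation]) -> str:
    return "; then ".join(a.inference for a in annotations) + "."


story = run_pipeline(["img1", "img2"], stub_identify, stub_infer, stub_narrate)
print(story)
```

Passing the stages in as callables keeps each module replaceable, so a human annotator's output or an automated model could stand behind the same interface. How environment, narrative goal, and audience would parameterize each stage is left open here, since the abstract does not say.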