Character-Centric Storytelling
This work addresses a specific limitation in visual storytelling, relevant to applications such as automated content creation, but it appears incremental: it builds on existing models and does not claim a major breakthrough.
The paper tackles the problem of generating stories from image sequences such that the stories account for all characters. It proposes a model that learns relationships between characters so that they are included in the narratives, uses the VIST dataset, and reports statistics on that dataset.
Sequential vision-to-language, or visual storytelling, has recently become an area of focus in both computer vision and language modeling. Although existing models generate narratives that read subjectively well, they can fail to produce stories that account for and address all prospective human and animal characters in the image sequences. To address this, we propose a model that implicitly learns relationships between the provided characters and thereby generates stories with the respective characters in scope. We use the VIST dataset for this purpose and report numerous statistics on the dataset. Finally, we describe the model, explain the experiment, and discuss our current status and future work.