CL · AI · CV · Nov 11, 2019

Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication

arXiv:1911.04192v2993 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of producing coherent narratives for visual storytelling, which is important for applications in automated content creation and assistive technologies, representing an incremental improvement over existing approaches.

The paper addresses the semantic incoherence of stories generated from image sequences by introducing a topic description task that detects global context and a multi-agent communication framework that uses the topic to guide story generation. On the VIST dataset, quantitative results and human evaluation show that the method produces higher-quality stories than state-of-the-art methods.

Visual storytelling aims to automatically generate a narrative paragraph from a sequence of images. Existing approaches construct a text description for each image independently and roughly concatenate the descriptions into a story, which leads to semantically incoherent content. In this paper, we propose a new approach to visual storytelling that introduces a topic description task to detect the global semantic context of an image stream. A story is then constructed with the guidance of the topic description. To combine the two generation tasks, we propose a multi-agent communication framework that treats the topic description generator and the story generator as two agents and learns them simultaneously via an iterative updating mechanism. We validate our approach on the VIST dataset, where quantitative results, ablations, and human evaluation demonstrate that our method generates higher-quality stories than state-of-the-art methods.
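
The alternating communication between the two agents can be illustrated with a toy sketch. This is not the paper's model: the learned generators are replaced by simple word-counting rules, images are stood in for by lists of tag words, and the names `topic_agent`, `story_agent`, and `communicate` are hypothetical. It only shows the control flow in which each agent is updated using the other's latest output.

```python
from collections import Counter


def topic_agent(image_tags, story):
    """Hypothetical topic describer: choose the tag best supported by
    the images and by the current story draft."""
    counts = Counter(tag for tags in image_tags for tag in tags)
    vocab = set(counts)
    for sentence in story:
        # Feedback from the story agent: only known tag words count.
        counts.update(w for w in sentence.split() if w in vocab)
    # Highest count wins; ties are broken alphabetically.
    return min(counts, key=lambda w: (-counts[w], w))


def story_agent(image_tags, topic):
    """Hypothetical storyteller: one sentence per image, each one
    conditioned on the shared topic."""
    story = []
    for tags in image_tags:
        others = [t for t in sorted(tags) if t != topic]
        if others:
            story.append(f"{topic} day with {' and '.join(others)}")
        else:
            story.append(f"{topic} day")
    return story


def communicate(image_tags, rounds=3):
    """Alternate between the two agents for a fixed number of rounds."""
    story = []
    topic = None
    for _ in range(rounds):
        topic = topic_agent(image_tags, story)   # topic sees the story draft
        story = story_agent(image_tags, topic)   # story sees the new topic
    return topic, story


if __name__ == "__main__":
    images = [["beach", "family"], ["beach", "kite"], ["beach", "sunset"]]
    topic, story = communicate(images)
    print(topic)
    for sentence in story:
        print(sentence)
```

In this toy version every sentence is tied to the same topic word, which is the kind of global consistency the topic description is meant to provide. A real system would replace both functions with trained neural generators.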
