CV · CL · Aug 4, 2021

Ordered Attention for Coherent Visual Storytelling

arXiv:2108.02180v3 · 10 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of generating coherent narratives from image sequences, with applications such as automated content creation. The advance is incremental, with a modest performance gain.

The paper tackles visual storytelling by generating coherent stories for image sequences, improving the METEOR score on the VIST dataset by 1% and enhancing coherency, focus, shareability, and image-groundedness in human evaluations.

We address the problem of visual storytelling, i.e., generating a story for a given sequence of images. While each sentence of the story should describe a corresponding image, a coherent story also needs to be consistent and relate to both future and past images. To achieve this we develop ordered image attention (OIA). OIA models interactions between the sentence-corresponding image and important regions in other images of the sequence. To highlight the important objects, a message-passing-like algorithm collects representations of those objects in an order-aware manner. To generate the story's sentences, we then highlight important image attention vectors with an Image-Sentence Attention (ISA). Further, to alleviate common linguistic mistakes like repetitiveness, we introduce an adaptive prior. The obtained results improve the METEOR score on the VIST dataset by 1%. In addition, an extensive human study verifies coherency improvements and shows that OIA and ISA generated stories are more focused, shareable, and image-grounded.
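The abstract describes attention over regions of the other images in the sequence, followed by a second attention over the resulting per-image vectors. The sketch below illustrates that two-stage idea using ordinary scaled dot-product attention. It is not the paper's OIA or ISA formulation: the function names (`attend`, `summarize_sequence`), the use of a single query vector for both stages, and the relative-offset bookkeeping are all assumptions made for illustration, and the message-passing step and the adaptive prior are not modeled.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, vectors):
    """Scaled dot-product attention (a generic stand-in, not the paper's method).

    Returns the attention-weighted average of `vectors` and the weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * v for q, v in zip(query, vec)) / scale for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def summarize_sequence(query, regions_per_image, current_index):
    """Hypothetical two-stage pooling for the image at `current_index`.

    Stage 1: pool the regions of every *other* image, visiting images in
             sequence order and recording each one's offset from the
             current image (an assumed way to keep order information).
    Stage 2: attend over the per-image summaries to get one context vector.
    """
    summaries, offsets = [], []
    for j, regions in enumerate(regions_per_image):
        if j == current_index:
            continue
        pooled, _ = attend(query, regions)
        summaries.append(pooled)
        offsets.append(j - current_index)
    context, image_weights = attend(query, summaries)
    return context, image_weights, offsets


if __name__ == "__main__":
    # Toy data: three images, each with a few 2-dimensional region features.
    regions = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[0.5, 0.5]],
        [[0.0, 1.0], [0.0, 2.0]],
    ]
    context, image_weights, offsets = summarize_sequence([1.0, 0.0], regions, 1)
    print(context, image_weights, offsets)
```

In this toy setup the query stands for the sentence-corresponding image, and the offsets list shows which past and future images contributed to the context vector. A real model would learn the query, key, and value projections rather than use raw features.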
