CV · Sep 7, 2021

Journalistic Guidelines Aware News Image Captioning

Xuewen Yang, Svebor Karaman, Joel Tetreault, Alex Jaimes

arXiv:2109.02865v250.1665 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more accurate, context-aware image captions in journalism, though it appears incremental because it builds on existing captioning methods with domain-specific adaptations.

The paper tackles the problem of generating captions for news images that adhere to journalistic guidelines and incorporate named entities. It proposes JoGANIC, which substantially outperforms state-of-the-art methods on two large-scale datasets.

The task of news article image captioning aims to generate descriptive and informative captions for news article images. Unlike conventional image captions that simply describe the content of the image in general terms, news image captions follow journalistic guidelines and rely heavily on named entities to describe the image content, often drawing context from the whole article they are associated with. In this work, we propose a new approach to this task, motivated by caption guidelines that journalists follow. Our approach, Journalistic Guidelines Aware News Image Captioning (JoGANIC), leverages the structure of captions to improve the generation quality and guide our representation design. Experimental results, including detailed ablation studies, on two large-scale publicly available datasets show that JoGANIC substantially outperforms state-of-the-art methods both on caption generation and named entity related metrics.
