CL · CV · Apr 29, 2020

Pragmatic Issue-Sensitive Image Captioning

arXiv:2004.14451v2998 citations
AI Analysis

This paper aims to make image captions more relevant to a user's communicative goal. The contribution is incremental in that it builds on existing neural captioners.

The paper tackles the insensitivity of image captioning systems to communicative goals. It proposes Issue-Sensitive Image Captioning (ISIC), which uses an extension of the Rational Speech Acts model to generate captions that resolve a specified issue. The authors report that the resulting captions are both highly descriptive and issue-sensitive.

Image captioning systems have recently improved dramatically, but they still tend to produce captions that are insensitive to the communicative goals that captions should meet. To address this, we propose Issue-Sensitive Image Captioning (ISIC). In ISIC, a captioning system is given a target image and an issue, which is a set of images partitioned in a way that specifies what information is relevant. The goal of the captioner is to produce a caption that resolves this issue. To model this task, we use an extension of the Rational Speech Acts model of pragmatic language use. Our extension is built on top of state-of-the-art pretrained neural image captioners and explicitly reasons about issues in our sense. We establish experimentally that these models generate captions that are both highly descriptive and issue-sensitive, and we show how ISIC can complement and enrich the related task of Visual Question Answering.
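The abstract describes an issue as a partition over a set of images and the captioner as an RSA-style speaker that reasons about that partition. The toy sketch below illustrates that idea only. It is not the authors' implementation: the function name `issue_sensitive_speaker`, the toy probability table, the uniform prior over images, and the choice to weight the issue term by the literal speaker (so that false captions stay at zero) are all assumptions made for this example.

```python
import numpy as np

def issue_sensitive_speaker(s0, cells, alpha=1.0):
    """Toy issue-sensitive RSA speaker (hypothetical helper, not from the paper).

    s0    : (n_images, n_captions) literal speaker, P(caption | image).
    cells : length n_images, the partition cell id of each image under the issue.
    alpha : rationality parameter applied to the issue term.
    Returns an (n_images, n_captions) array, P(caption | image, issue).
    """
    s0 = np.asarray(s0, dtype=float)
    cells = np.asarray(cells)
    # Literal listener P(image | caption), assuming a uniform prior over images.
    l0 = s0 / s0.sum(axis=0, keepdims=True)
    # same_cell[i, j] is 1.0 when images i and j fall in the same partition cell.
    same_cell = (cells[:, None] == cells[None, :]).astype(float)
    # Probability that the listener lands in the target image's cell.
    cell_mass = same_cell @ l0
    # Assumption: weight by s0 so captions that are false of the image get zero.
    scores = s0 * cell_mass ** alpha
    return scores / scores.sum(axis=1, keepdims=True)

# Three toy images: red square, red circle, blue square.
captions = ["red", "square", "circle", "blue"]
s0 = [
    [0.5, 0.5, 0.0, 0.0],  # red square
    [0.5, 0.0, 0.5, 0.0],  # red circle
    [0.0, 0.5, 0.0, 0.5],  # blue square
]

# Issue "what colour?": {red square, red circle} vs {blue square}.
by_colour = issue_sensitive_speaker(s0, cells=[0, 0, 1])
# Issue "what shape?": {red square, blue square} vs {red circle}.
by_shape = issue_sensitive_speaker(s0, cells=[0, 1, 0])

print(captions[int(by_colour[0].argmax())])  # caption preferred for the red square under the colour issue
print(captions[int(by_shape[0].argmax())])   # caption preferred for the red square under the shape issue
```

For the red square, this toy speaker prefers "red" when the issue partitions by colour and "square" when it partitions by shape, which is the behaviour the abstract calls issue-sensitive. In the paper the literal speaker is a pretrained neural captioner rather than a hand-written table.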

Code Implementations: 1 repo