CV, Jan 25, 2019

Improving Image Captioning by Leveraging Knowledge Graphs

arXiv:1901.08942v1, 64 citations
Originality: Incremental advance
AI Analysis

The approach improves image captioning for applications requiring detailed descriptions, though it appears incremental: it enhances existing methods rather than introducing a new paradigm.

The paper tackles the problem of generating better image captions by using knowledge graphs to augment the information extracted from images. It reports that variants of state-of-the-art methods that use knowledge-graph information can substantially outperform image-only variants on benchmark datasets such as MS COCO, as measured by CIDEr-D.

We explore the use of knowledge graphs that capture general or commonsense knowledge to augment the information extracted from images by the state-of-the-art methods for image captioning. The results of our experiments on several benchmark data sets such as MS COCO, as measured by CIDEr-D, a performance metric for image captioning, show that the variants of the state-of-the-art methods for image captioning that make use of the information extracted from knowledge graphs can substantially outperform those that rely solely on the information extracted from images.

