CV · CL · Jun 7, 2019

Figure Captioning with Reasoning and Sequence-Level Training

arXiv:1906.02850v1 · 43 citations
Originality: Highly original
AI Analysis

This work addresses the challenge of making figures accessible to computers for automatic processing, which is important for researchers and practitioners dealing with large collections of visual data.

The authors tackled the problem of automatically generating natural language descriptions for figures like bar charts and line plots by introducing a new dataset (FigCAP) and novel attention mechanisms (Label Maps Attention and Relation Maps Attention), combined with sequence-level training using reinforcement learning. Their method outperformed baselines, showing significant potential for captioning large repositories of figures.

Figures, such as bar charts, pie charts, and line plots, are widely used to convey important information in a concise format. They are usually human-friendly but difficult for computers to process automatically. In this work, we investigate the problem of figure captioning, where the goal is to automatically generate a natural language description of the figure. While natural image captioning has been studied extensively, figure captioning has received relatively little attention and remains a challenging problem. First, we introduce a new dataset for figure captioning, FigCAP, based on FigureQA. Second, we propose two novel attention mechanisms. To achieve accurate generation of labels in figures, we propose Label Maps Attention. To model the relations between figure labels, we propose Relation Maps Attention. Third, we use sequence-level training with reinforcement learning in order to directly optimize evaluation metrics, which alleviates the exposure bias issue and further improves the models in generating long captions. Extensive experiments show that the proposed method outperforms the baselines, thus demonstrating a significant potential for the automatic captioning of vast repositories of figures.
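
As a rough illustration of what sequence-level training with reinforcement learning can look like, the sketch below computes a REINFORCE-style loss with a greedy-decoding baseline (often called self-critical training). The abstract does not specify the reward, the baseline, or the model, so everything here is an assumption: `unigram_f1` is a toy stand-in for a real captioning metric, and `sequence_level_loss` is a hypothetical helper, not the authors' implementation. Because the loss is computed on captions the model samples itself, training sees the model's own outputs, which is how this family of methods addresses exposure bias.

```python
import math
from typing import List


def unigram_f1(candidate: List[str], reference: List[str]) -> float:
    """Toy sequence-level reward: unigram F1 between two token lists.

    Assumption: stands in for a real caption metric; not from the paper.
    """
    if not candidate or not reference:
        return 0.0
    ref_counts = {}
    for tok in reference:
        ref_counts[tok] = ref_counts.get(tok, 0) + 1
    overlap = 0
    for tok in candidate:
        if ref_counts.get(tok, 0) > 0:
            overlap += 1
            ref_counts[tok] -= 1
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def sequence_level_loss(
    sample_log_probs: List[float],
    sampled: List[str],
    greedy: List[str],
    reference: List[str],
) -> float:
    """REINFORCE loss with the greedy caption's reward as baseline.

    sample_log_probs holds log p(token) for each token of the sampled caption.
    Minimizing the loss raises the probability of samples that score
    better than the greedy caption and lowers it for worse ones.
    """
    advantage = unigram_f1(sampled, reference) - unigram_f1(greedy, reference)
    return -advantage * sum(sample_log_probs)


reference = ["the", "red", "bar", "is", "highest"]
sampled = ["the", "red", "bar", "is", "highest"]
greedy = ["the", "bar"]
log_probs = [math.log(0.5)] * len(sampled)
loss = sequence_level_loss(log_probs, sampled, greedy, reference)
```

In a real system the log-probabilities would come from the captioning model and the loss would be backpropagated through them; the reward itself is treated as a constant, which is what lets a non-differentiable evaluation metric be optimized directly.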

