CV · Sep 26, 2017

Understanding Infographics through Textual and Visual Tag Prediction

arXiv:1709.09215v1 · 41 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of understanding infographics for applications in information retrieval and content analysis, presenting an incremental method for visual hashtag discovery.

The paper tackles the problem of automatically extracting diagnostic visual elements (visual hashtags) from infographics. It first predicts text tags from the text extracted from an infographic, then uses those tags to localize key visual components. The approach is evaluated on a dataset of 29K infographics spanning 26 categories and 391 tags, and the resulting visual hashtags are compared to human annotations.

We introduce the problem of visual hashtag discovery for infographics: extracting visual elements from an infographic that are diagnostic of its topic. Given an infographic as input, our computational approach automatically outputs textual and visual elements predicted to be representative of the infographic content. Concretely, from a curated dataset of 29K large infographic images sampled across 26 categories and 391 tags, we present an automated two-step approach. First, we extract the text from an infographic and use it to predict text tags indicative of the infographic content. Second, we use these predicted text tags as a supervisory signal to localize the most diagnostic visual elements from within the infographic, i.e., visual hashtags. We report performance on a categorization problem and a multi-label tag prediction problem, and compare our proposed visual hashtags to human annotations.
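The two-step idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the keyword table, the scoring function, the threshold, and the per-patch scores are all hypothetical stand-ins for the learned text and visual models, chosen only to show how predicted text tags could act as the signal for ranking image regions.

```python
import math
from collections import Counter

# Hypothetical tag vocabulary (illustrative only; the paper uses 391 tags
# and learned models, not keyword lists).
TAG_KEYWORDS = {
    "health": {"heart", "diet", "exercise"},
    "finance": {"money", "budget", "savings"},
    "travel": {"flight", "hotel", "passport"},
}


def predict_text_tags(text, threshold=0.5):
    """Step 1 (toy): multi-label tag prediction from extracted text.

    Each tag is scored independently, so several tags can be returned.
    """
    words = Counter(text.lower().split())
    scores = {}
    for tag, keywords in TAG_KEYWORDS.items():
        hits = sum(words[w] for w in keywords)
        # Squash the keyword hit count into [0, 1); zero hits gives 0.0.
        scores[tag] = 1.0 - math.exp(-hits)
    return {tag: s for tag, s in scores.items() if s >= threshold}


def rank_patches(patch_scores, predicted_tags):
    """Step 2 (toy): rank image patches using the predicted tags as signal.

    patch_scores maps a patch id to per-tag scores, assumed to come from
    some patch-level visual classifier (hypothetical here).
    """
    ranked = []
    for patch_id, tag_scores in patch_scores.items():
        total = sum(
            tag_scores.get(tag, 0.0) * weight
            for tag, weight in predicted_tags.items()
        )
        ranked.append((total, patch_id))
    ranked.sort(reverse=True)
    return [patch_id for _, patch_id in ranked]


if __name__ == "__main__":
    text = "Eat a balanced diet and exercise to protect your heart"
    tags = predict_text_tags(text)
    patches = {
        "p0": {"health": 0.1, "finance": 0.9},
        "p1": {"health": 0.8},
        "p2": {"health": 0.3},
    }
    print(tags)
    print(rank_patches(patches, tags))
```

The top-ranked patches play the role of visual hashtags in this sketch; in practice both steps would be replaced by trained models.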

Code Implementations: 1 repo
