CV · LG · Mar 26, 2020

Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models

arXiv:2003.11743v2 · 6 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses biased captions in daily-life applications, but its contribution is incremental: it builds on existing pre-trained networks and focuses on dataset and metric creation.

The paper tackles the problem of biased and limited information in image captioning models by introducing Egoshots, a dataset of 978 real-life images without captions, and proposing a new metric called Semantic Fidelity (SF) to evaluate caption quality without annotations.

Image captioning models can generate grammatically correct, human-understandable sentences. However, most captions convey limited information because the models are trained on datasets that do not caption all the objects found in everyday life. Lacking this prior information, most captions are biased toward only a few of the objects present in the scene, which limits their usefulness in daily life. In this paper, we attempt to show the biased nature of existing image captioning models and present a new image captioning dataset, Egoshots, consisting of 978 real-life images with no captions. We further use state-of-the-art pre-trained image captioning and object recognition networks to annotate our images and show the limitations of existing works. Furthermore, to evaluate the quality of the generated captions, we propose a new image captioning metric, object-based Semantic Fidelity (SF). Existing image captioning metrics can evaluate a caption only when corresponding annotations are available; SF, in contrast, allows evaluating captions generated for images without annotations, making it highly useful for captions generated in real-life settings.
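
To make the idea of annotation-free, object-based evaluation concrete, here is a minimal Python sketch. It is not the paper's Semantic Fidelity formula, which this abstract does not spell out. It is a simplified stand-in that scores a caption by the fraction of detector-reported objects it mentions. The function name `object_coverage`, the exact word-matching rule, and the sample labels are all assumptions made for illustration.

```python
import re
from typing import Iterable


def object_coverage(caption: str, detected_objects: Iterable[str]) -> float:
    """Illustrative stand-in, NOT the paper's SF metric.

    Returns the fraction of distinct detected object labels whose words
    all appear in the caption. No reference captions are needed.
    """
    # Lowercase word set of the caption, ignoring punctuation and digits.
    words = set(re.findall(r"[a-z]+", caption.lower()))
    # Distinct detector labels (hypothetical output of an object detector).
    labels = {label.lower().strip() for label in detected_objects}
    if not labels:
        return 0.0
    # A label counts as mentioned only if every word in it is in the caption.
    mentioned = {
        label for label in labels
        if all(token in words for token in label.split())
    }
    return len(mentioned) / len(labels)


if __name__ == "__main__":
    caption = "a man sitting at a table with a laptop"
    detections = ["person", "laptop", "cup", "table"]
    print(object_coverage(caption, detections))
```

Exact word matching is deliberately crude: in the sample above, "man" does not match the detector label "person", so the caption is credited with only two of the four detected objects. A practical object-based metric would need some notion of semantic similarity between caption words and object labels.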

Code Implementations: 2 repos
