CV, CL · Sep 16, 2022

Belief Revision based Caption Re-ranker with Visual Semantic Information

arXiv:2209.08163v1 · 580 citations · h-index: 48
Originality: Incremental advance
AI Analysis

This work addresses the need for better caption accuracy in image-captioning systems, but it is incremental as it builds on existing re-ranking and belief revision methods.

The paper tackles the problem of improving captions from image-caption generation systems by proposing a re-ranking approach that uses visual-semantic measures and the Belief Revision framework to select the best caption, resulting in enhanced performance without additional training.

In this work, we focus on improving the captions generated by image-caption generation systems. We propose a novel re-ranking approach that leverages visual-semantic measures to identify the ideal caption that maximally captures the visual information in the image. Our re-ranker utilizes the Belief Revision framework (Blok et al., 2003) to calibrate the original likelihood of the top-n captions by explicitly exploiting the semantic relatedness between the depicted caption and the visual context. Our experiments demonstrate the utility of our approach, where we observe that our re-ranker can enhance the performance of a typical image-captioning system without the necessity of any additional training or fine-tuning.
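The re-ranking step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not state the update rule, so the code assumes the similarity-to-probability form usually associated with Blok et al. (2003), P(w|c) = P(w)^α with α = ((1 − sim(w, c)) / (1 + sim(w, c)))^(1 − P(c)). The function names `revise` and `rerank`, the candidate tuples, and all numeric values are hypothetical.

```python
from typing import List, Tuple


def revise(p_caption: float, sim: float, p_context: float) -> float:
    """Revise a caption likelihood given its relatedness to the visual context.

    ASSUMED update rule (not given in the abstract):
        P(w|c) = P(w) ** alpha
        alpha  = ((1 - sim) / (1 + sim)) ** (1 - P(c))
    """
    if not 0.0 < p_caption <= 1.0:
        raise ValueError("p_caption must be in (0, 1]")
    if not 0.0 <= sim <= 1.0:
        raise ValueError("sim must be in [0, 1]")
    if not 0.0 <= p_context <= 1.0:
        raise ValueError("p_context must be in [0, 1]")
    alpha = ((1.0 - sim) / (1.0 + sim)) ** (1.0 - p_context)
    # alpha == 1 leaves the likelihood unchanged; alpha < 1 raises it.
    return p_caption ** alpha


def rerank(
    candidates: List[Tuple[str, float, float]], p_context: float
) -> List[Tuple[str, float]]:
    """Sort (caption, likelihood, similarity) candidates by revised likelihood."""
    scored = [(text, revise(p, sim, p_context)) for text, p, sim in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical top-n output of a captioner: (caption, likelihood, similarity
# between the caption and the visual context, e.g. a detected object label).
candidates = [
    ("a dog on a beach", 0.20, 0.10),
    ("a dog catching a frisbee", 0.15, 0.60),
]
ranking = rerank(candidates, p_context=0.5)
print(ranking[0][0])
```

In this toy input, the second caption starts with the lower likelihood but is more related to the visual context, so the revision moves it to the top. No model weights are touched, which matches the abstract's claim that the re-ranker needs no additional training or fine-tuning.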

Code Implementations: 1 repo