AI
Dec 24, 2020

LCEval: Learned Composite Metric for Caption Evaluation

arXiv:2012.13136v1
2 citations
AI Analysis

This work addresses the problem of inaccurate caption-level evaluation for developers of captioning systems, offering a more reliable metric.

This paper introduces LCEval, a neural network-based learned metric designed to improve caption-level evaluation of captioning systems. The proposed metric demonstrates superior caption-level correlation and strong system-level correlation with human assessments, outperforming existing metrics.

Automatic evaluation metrics are of fundamental importance in the development and fine-grained analysis of captioning systems. While current evaluation metrics tend to achieve an acceptable correlation with human judgements at the system level, they fail to do so at the caption level. In this work, we propose a neural network-based learned metric to improve caption-level evaluation. To gain deeper insight into the parameters that affect a learned metric's performance, this paper investigates the relationship between different linguistic features and the caption-level correlation of learned metrics. We also compare metrics trained with different training examples to measure the variations in their evaluation. Moreover, we perform a robustness analysis, which highlights the sensitivity of learned and handcrafted metrics to various sentence perturbations. Our empirical analysis shows that our proposed metric not only outperforms existing metrics in terms of caption-level correlation but also shows a strong system-level correlation with human assessments.
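The distinction between caption-level and system-level correlation can be illustrated with a minimal sketch. This is not the paper's evaluation code. The toy scores, the function names, and the choice of Kendall's tau-a (caption level) and Pearson's r (system level) are all assumptions made for illustration, since the abstract does not name the coefficients used.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt
from statistics import mean

# Hypothetical toy data: (system, metric_score, human_score) for each caption.
RECORDS = [
    ("A", 0.9, 5), ("A", 0.6, 4),
    ("B", 0.3, 1), ("B", 0.5, 3),
    ("C", 0.4, 2), ("C", 0.2, 2),
]


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs (sign == 0) count in the denominator only.
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def caption_level_correlation(records):
    """Correlate metric and human scores over individual captions."""
    metric = [m for _, m, _ in records]
    human = [h for _, _, h in records]
    return kendall_tau_a(metric, human)


def system_level_correlation(records):
    """Average scores per system first, then correlate the averages."""
    by_system = defaultdict(lambda: ([], []))
    for system, m, h in records:
        by_system[system][0].append(m)
        by_system[system][1].append(h)
    metric_means = [mean(ms) for ms, _ in by_system.values()]
    human_means = [mean(hs) for _, hs in by_system.values()]
    return pearson_r(metric_means, human_means)


if __name__ == "__main__":
    print("caption-level tau:", caption_level_correlation(RECORDS))
    print("system-level r:   ", system_level_correlation(RECORDS))
```

Averaging per system smooths out per-caption disagreements, which is one reason a metric can correlate well with human judgements at the system level while still ranking individual captions poorly.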

Code Implementations: 1 repo