CV · CL · Apr 1, 2015

Microsoft COCO Captions: Data Collection and Evaluation Server

arXiv:1504.00325v2 · 2896 citations
Originality: Synthesis-oriented
AI Analysis

The paper provides a standardized benchmark for evaluating image captioning algorithms, addressing inconsistent evaluation across the field.

The paper introduces the Microsoft COCO Caption dataset, which is planned to contain over 1.5 million captions for over 330,000 images, and an evaluation server that scores automatically generated captions using metrics such as BLEU, METEOR, ROUGE, and CIDEr.

In this paper we describe the Microsoft COCO Caption dataset and evaluation server. When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions will be provided. To ensure consistency in evaluation of automatic caption generation algorithms, an evaluation server is used. The evaluation server receives candidate captions and scores them using several popular metrics, including BLEU, METEOR, ROUGE and CIDEr. Instructions for using the evaluation server are provided.
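To give a feel for what scoring a candidate caption against several human references involves, here is a minimal sketch of clipped unigram precision, the building block of BLEU-1. It is an illustration only, not the evaluation server's code: the function name `clipped_unigram_precision`, the whitespace tokenization, and the example captions are all assumptions, and full BLEU additionally combines higher-order n-grams and a brevity penalty.

```python
from collections import Counter


def clipped_unigram_precision(candidate, references):
    """Clipped unigram precision of one candidate against several references.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which is an assumption.
    """
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    cand_counts = Counter(cand_tokens)

    # For each word, keep the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for word, count in Counter(ref.lower().split()).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)

    # A candidate word is credited at most as often as it appears
    # in the most generous reference ("clipping").
    clipped = sum(
        min(count, max_ref_counts[word]) for word, count in cand_counts.items()
    )
    return clipped / len(cand_tokens)


# Made-up reference captions for a single image.
references = [
    "a man riding a horse on a beach",
    "a person rides a horse along the shore",
]
score = clipped_unigram_precision("a man rides a horse", references)
repeat_score = clipped_unigram_precision("the the the", references)
```

Clipping is what keeps a degenerate candidate such as "the the the" from scoring well: "the" occurs at most once in any reference, so only one of its three occurrences is credited.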

Code Implementations: 18 repos

Data from Papers with Code (CC-BY-SA-4.0)

