CL, CV · May 2, 2016

Multi30K: Multilingual English-German Image Descriptions

arXiv:1605.00459v1 · 652 citations
Originality: Synthesis-oriented
AI Analysis

This dataset enables multilingual multimodal research, such as image description and multimodal machine translation. The contribution is incremental because it extends the existing Flickr30K dataset rather than collecting new images.

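The authors introduce the Multi30K dataset to address the lack of multilingual resources in image description research. It extends Flickr30K with German translations produced by professional translators for a subset of the English descriptions, and with German descriptions crowdsourced independently of the original English ones.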

We introduce the Multi30K dataset to stimulate multilingual multimodal research. Recent advances in image description have been demonstrated on English-language datasets almost exclusively, but image description should not be limited to English. This dataset extends the Flickr30K dataset with i) German translations created by professional translators over a subset of the English descriptions, and ii) descriptions crowdsourced independently of the original English descriptions. We outline how the data can be used for multilingual image description and multimodal machine translation, but we anticipate the data will be useful for a broader range of tasks.

Code Implementations: 2 repos
