CL CV LG NE
Oct 15, 2015

Multilingual Image Description with Neural Sequence Models

arXiv:1510.04709v2
78 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for multilingual image descriptions. The advance is incremental because it builds on existing neural methods.

The paper tackles the problem of generating image descriptions in multiple languages by combining neural machine translation and image description techniques. On the IAPR-TC12 dataset, it reports significant improvements in BLEU4 and Meteor scores over a monolingual baseline.

In this paper we present an approach to multi-language image description bringing together insights from neural machine translation and neural image description. To create a description of an image for a given target language, our sequence generation models condition on feature vectors from the image, the description from the source language, and/or a multimodal vector computed over the image and a description in the source language. In image description experiments on the IAPR-TC12 dataset of images aligned with English and German sentences, we find significant and substantial improvements in BLEU4 and Meteor scores for models trained over multiple languages, compared to a monolingual baseline.
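As a rough illustration of what "conditioning on feature vectors" can mean, the sketch below builds an initial decoder state from an image vector, a source-language vector, or both. It is a minimal sketch under stated assumptions, not the paper's implementation: the `Conditioner` class, the project-sum-tanh combination, the vector sizes, and the random untrained weights are all hypothetical choices made for the example. The multimodal-vector variant mentioned in the abstract is not modelled here.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class Conditioner:
    """Hypothetical helper: maps optional conditioning vectors to an initial state."""

    def __init__(self, hidden_size, image_size, source_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Assumption: one linear projection per conditioning source.
        self.w_image = random_matrix(hidden_size, image_size, rng)
        self.w_source = random_matrix(hidden_size, source_size, rng)

    def initial_state(self, image_vec=None, source_vec=None):
        """Sum the projections of whichever vectors are given, then apply tanh."""
        pre_activation = [0.0] * self.hidden_size
        if image_vec is not None:
            projected = matvec(self.w_image, image_vec)
            pre_activation = [a + b for a, b in zip(pre_activation, projected)]
        if source_vec is not None:
            projected = matvec(self.w_source, source_vec)
            pre_activation = [a + b for a, b in zip(pre_activation, projected)]
        return [math.tanh(a) for a in pre_activation]


if __name__ == "__main__":
    conditioner = Conditioner(hidden_size=4, image_size=3, source_size=2)
    image_features = [0.5, -1.0, 2.0]   # toy stand-in for image features
    source_features = [1.0, 0.25]       # toy stand-in for a source-sentence encoding
    print(conditioner.initial_state(image_vec=image_features))
    print(conditioner.initial_state(image_vec=image_features, source_vec=source_features))
```

Passing only `image_vec` corresponds to a monolingual image-description setting, while passing both vectors corresponds to the setting where the target-language generator also sees a source-language description.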

Code Implementations: 1 repo
