CL, Oct 7, 2017

OSU Multimodal Machine Translation System Report

arXiv:1710.02718v21098 citations
Originality: Incremental advance
AI Analysis

This work aims to improve translation accuracy in multimodal settings. It is an incremental advance: it builds on an existing shared task and adds a simple enhancement.

The paper tackles multimodal machine translation by incorporating images into both encoding and decoding, and achieves the best TER for English-German on the MSCOCO dataset.

This paper describes Oregon State University's submissions to the WMT'17 shared task "multimodal translation task I". In this task, all the sentence pairs are image captions in different languages. The key difference between this task and conventional machine translation is that a corresponding image is available as additional information for each sentence pair. In this paper, we introduce a simple but effective system which takes an image shared between different languages and feeds it into both the encoding and decoding sides. We report our system's performance for English-French and English-German on the Flickr30K (in-domain) and MSCOCO (out-of-domain) datasets. Our system achieves the best performance in TER for English-German on the MSCOCO dataset.

