CL · AI · Sep 21, 2020

Generative Imagination Elevates Machine Translation

arXiv:2009.09654v2730 citations
Originality: Incremental advance
AI Analysis

The work targets translation accuracy while removing the need for paired images at inference time, though it is an incremental advance over existing multimodal methods.

The paper tackles machine translation by generating visual representations from source sentences to improve translation quality, showing that ImagiT significantly outperforms text-only baselines.

There are common semantics shared across text and images. Given a sentence in a source language, does depicting the visual scene help translation into a target language? Existing multimodal neural machine translation (MNMT) methods require (source sentence, target sentence, image) triplets for training and (source sentence, image) pairs for inference. In this paper, we propose ImagiT, a novel machine translation method via visual imagination. ImagiT first learns to generate a visual representation from the source sentence, and then uses both the source sentence and the "imagined representation" to produce a target translation. Unlike previous methods, it needs only the source sentence at inference time. Experiments demonstrate that ImagiT benefits from visual imagination and significantly outperforms text-only neural machine translation baselines. Further analysis reveals that the imagination process in ImagiT helps fill in missing information when the degradation strategy is applied.
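The inference-time data flow described in the abstract (source sentence in, imagined visual representation generated from it, both fused for decoding) can be illustrated with a toy sketch. This is not ImagiT's architecture: the hash-seeded embeddings, the mean-pool-plus-linear "imagination" step, the sigmoid gate, and every name below (`encode`, `imagine`, `fuse`, `translate_inputs`, `WEIGHTS`, `DIM`) are assumptions made for illustration only. The point it shows is structural: no image is read at any stage.

```python
import math
import random

DIM = 4  # toy feature size (assumption, not taken from the paper)


def embed(token, dim=DIM):
    """Deterministic toy embedding: seed a private RNG with the token string."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def encode(sentence):
    """Stand-in text encoder: one vector per whitespace-separated token."""
    return [embed(tok) for tok in sentence.lower().split()]


def imagine(text_states, weights):
    """Stand-in 'imagination' step: mean-pool the text states, then apply a
    linear map to obtain one visual feature vector. No image is involved."""
    n = len(text_states)
    pooled = [sum(col) / n for col in zip(*text_states)]
    return [sum(w * x for w, x in zip(row, pooled)) for row in weights]


def fuse(text_states, visual):
    """Add the imagined vector to each text state, scaled by a sigmoid gate
    computed from their dot product."""
    fused = []
    for state in text_states:
        score = sum(s * v for s, v in zip(state, visual))
        gate = 1.0 / (1.0 + math.exp(-score))
        fused.append([s + gate * v for s, v in zip(state, visual)])
    return fused


def translate_inputs(sentence, weights):
    """Inference path: only the source sentence is required as input."""
    text_states = encode(sentence)
    visual = imagine(text_states, weights)
    return fuse(text_states, visual)


# Fixed toy projection matrix; a real system would learn these parameters.
_rng = random.Random(0)
WEIGHTS = [[_rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(DIM)]

decoder_inputs = translate_inputs("a dog runs on the beach", WEIGHTS)
print(len(decoder_inputs), len(decoder_inputs[0]))
```

In this sketch the fused vectors would be handed to a decoder, which is omitted here. A conventional MNMT pipeline would instead need an image argument alongside the sentence in `translate_inputs`.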

