CL CV LG

Dec 8, 2020

Globetrotter: Connecting Languages by Connecting Images

Dídac Surís, Dave Epstein, Carl Vondrick

arXiv:2012.04631v41.19 citations

h-index: 32

Has Code

Originality: Highly original

AI Analysis

This work is significant for researchers and practitioners in machine translation, offering a novel approach to overcome data scarcity for low-resource language pairs by using visual information.

This paper addresses the challenge of machine translation between many languages without requiring parallel corpora for all pairs. It proposes a method that aligns text segments across different languages by leveraging the similarity of their associated images, outperforming previous unsupervised word and sentence translation methods.

Machine translation between many languages at once is highly challenging, since training with ground truth requires supervision between all language pairs, which is difficult to obtain. Our key insight is that, while languages may vary drastically, the underlying visual appearance of the world remains consistent. We introduce a method that uses visual observations to bridge the gap between languages, rather than relying on parallel corpora or topological properties of the representations. We train a model that aligns segments of text from different languages if and only if the images associated with them are similar and each image in turn is well-aligned with its textual description. We train our model from scratch on a new dataset of text in over fifty languages with accompanying images. Experiments show that our method outperforms previous work on unsupervised word and sentence translation using retrieval. Code, models and data are available on globetrotter.cs.columbia.edu.
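The alignment rule in the abstract (two text segments should match if and only if their images are similar and each image is well aligned with its own text) can be illustrated with a small sketch. This is not the authors' implementation: the function names `alignment_target` and `retrieve_translation`, the use of cosine similarity, and the choice to combine the three conditions as a product of clipped similarities are all assumptions made for illustration. The embeddings are plain lists of floats standing in for the outputs of text and image encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_target(text_i, image_i, text_j, image_j):
    """Hypothetical target for the similarity of two texts in different languages.

    Assumption: the "if and only if" condition is modelled as a product of
    three clipped similarities, so the target is high only when the two
    images are similar AND each text is well aligned with its own image.
    """
    visual = max(0.0, cosine(image_i, image_j))
    grounded_i = max(0.0, cosine(text_i, image_i))
    grounded_j = max(0.0, cosine(text_j, image_j))
    return visual * grounded_i * grounded_j


def retrieve_translation(query, candidates):
    """Translation by retrieval: index of the candidate embedding closest to the query."""
    return max(range(len(candidates)), key=lambda k: cosine(query, candidates[k]))
```

Under these assumptions, a pair whose images are unrelated receives a target of zero regardless of how the texts compare, and translation reduces to a nearest-neighbour lookup among candidate sentences in the target language.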
