CL · Sep 18, 2017

Limitations of Cross-Lingual Learning from Image Search

arXiv:1709.05914v139.31101 citations

Originality: Synthesis-oriented

AI Analysis

This work examines the limits of cross-lingual learning from image search for non-noun parts of speech. The contribution is incremental: it builds on existing methods, but it shows that they do not scale beyond nouns.

The paper investigates whether cross-lingual representations for adjectives and verbs can be learned from image search data, as prior work did for nouns. Across five language pairs, it finds that the approach does not scale beyond simple nouns.

Cross-lingual representation learning is an important step in making NLP scale to all the world's languages. Recent work on bilingual lexicon induction suggests that it is possible to learn cross-lingual representations of words based on similarities between images associated with these words. However, that work focused on the translation of selected nouns only. In our work, we investigate whether the meaning of other parts-of-speech, in particular adjectives and verbs, can be learned in the same way. We also experiment with combining the representations learned from visual data with embeddings learned from textual data. Our experiments across five language pairs indicate that previous work does not scale to the problem of learning cross-lingual representations beyond simple nouns.
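The image-based lexicon induction idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes each word is represented by the mean of the feature vectors of its image search results, and that a translation is chosen as the target-language word with the highest cosine similarity. The toy feature values, the word lists, and the helper names (`mean_vector`, `cosine`, `combine`, `induce_lexicon`) are all hypothetical. The `combine` function shows one simple way to fuse visual and textual vectors (weighted concatenation), which is only meaningful if the textual embeddings already live in a shared cross-lingual space.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length feature vectors into one word vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combine(visual, textual, weight=0.5):
    """Weighted concatenation of a visual and a textual vector.

    Assumption: this is one simple fusion choice, not necessarily the
    one used in the paper.
    """
    return [weight * x for x in visual] + [(1 - weight) * x for x in textual]


def induce_lexicon(source, target):
    """Map each source word to the target word with the most similar vector."""
    return {
        s_word: max(target, key=lambda t_word: cosine(s_vec, target[t_word]))
        for s_word, s_vec in source.items()
    }


# Hypothetical image features: two images per word, three dimensions each.
english_images = {
    "dog": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "car": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.1]],
}
german_images = {
    "Hund": [[0.85, 0.15, 0.05], [0.9, 0.1, 0.0]],
    "Auto": [[0.05, 0.9, 0.1], [0.1, 0.85, 0.2]],
}

english = {word: mean_vector(feats) for word, feats in english_images.items()}
german = {word: mean_vector(feats) for word, feats in german_images.items()}

lexicon = induce_lexicon(english, german)
print(lexicon)
```

Concrete nouns such as those in the toy data tend to have visually consistent image results, which is why nearest-neighbour matching works here. The abstract's finding is that this consistency does not carry over to adjectives and verbs.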
