CL · Jan 21, 2018

Embedding Learning Through Multilingual Concept Induction

Philipp Dufter, Mengjie Zhao, Martin Schmitt, Alexander Fraser, Hinrich Schütze

arXiv:1801.06807v3 · 32.01096 citations

Originality: Incremental advance

AI Analysis

This work addresses multilingual semantic representation for natural language processing applications. It appears incremental, since it builds on existing embedding methods.

The paper tackles the problem of learning vector representations of words across many languages by introducing a concept induction method, achieving better performance than previous approaches on crosslingual word similarity and sentiment analysis tasks.

We present a new method for estimating vector space representations of words: embedding learning by concept induction. We test this method on a highly parallel corpus and learn semantic representations of words in 1259 different languages in a single common space. An extensive experimental evaluation on crosslingual word similarity and sentiment analysis indicates that concept-based multilingual embedding learning performs better than previous approaches.
