CL · May 5, 2014

Learning Bilingual Word Representations by Marginalizing Alignments

Tomáš Kočiský, Karl Moritz Hermann, Phil Blunsom

arXiv:1405.0947v1 · 83 citations

Originality: Incremental advance

AI Analysis

This work aims to improve cross-lingual natural language processing tasks, such as classification, with a more effective method for learning bilingual word representations. It appears incremental, since it builds on existing alignment-based approaches.

The authors developed a probabilistic model that learns bilingual word representations while marginalizing over word alignments, which captures a broader semantic context than previous methods that rely on hard alignments. They demonstrated its advantage by outperforming the prior published state of the art in a cross-lingual classification task.

We present a probabilistic model that simultaneously learns alignments and distributed representations for bilingual data. By marginalizing over word alignments the model captures a larger semantic context than prior work relying on hard alignments. The advantage of this approach is demonstrated in a cross-lingual classification task, where we outperform the prior published state of the art.
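To make the contrast between marginalizing over alignments and committing to a hard alignment concrete, here is a minimal sketch. It is not the authors' model: it assumes a simplified IBM Model 1 style setup with a uniform alignment prior, and the table `t` and the function names `log_marginal`, `log_hard`, and `alignment_posterior` are hypothetical, introduced only for illustration. In the paper the word-pair scores come from learned distributed representations. Here they are fixed numbers.

```python
import math

def log_marginal(t):
    """Log-likelihood of the target sentence with alignments summed out.

    t[i][j] is an assumed probability of target word j given source word i.
    Each target word may align to any source word with uniform prior 1/len(t).
    """
    n_src, n_tgt = len(t), len(t[0])
    return sum(
        math.log(sum(t[i][j] for i in range(n_src)) / n_src)
        for j in range(n_tgt)
    )

def log_hard(t):
    """Same likelihood, but each target word keeps only its single best source word."""
    n_src, n_tgt = len(t), len(t[0])
    return sum(
        math.log(max(t[i][j] for i in range(n_src)) / n_src)
        for j in range(n_tgt)
    )

def alignment_posterior(t, j):
    """Posterior over source positions for target word j under the uniform prior."""
    column = [row[j] for row in t]
    z = sum(column)
    return [p / z for p in column]

# Toy example: 2 source words (rows), 2 target words (columns).
t = [[0.6, 0.1],
     [0.2, 0.5]]
soft = log_marginal(t)
hard = log_hard(t)
post = alignment_posterior(t, 0)
```

With marginalization, every source word contributes to each target word in proportion to its posterior, so a gradient-based learner would receive signal from the whole source sentence rather than from one aligned word. This is one way to read the abstract's claim about a larger semantic context.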
