CL · Aug 31, 2018

Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction

arXiv:1809.00064v31104 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement to methods for aligning multilingual word embeddings in natural language processing.

The paper tackles the problem of bilingual dictionary induction by proposing a modified approach that projects languages onto a latent space instead of directly aligning them, which improves performance in low-resource settings.

Most recent approaches to bilingual dictionary induction find a linear alignment between the word vector spaces of two languages. We show that projecting the two languages onto a third, latent space, rather than directly onto each other, while equivalent in terms of expressivity, makes it easier to learn approximate alignments. Our modified approach also allows for supporting languages to be included in the alignment process, to obtain an even better performance in low resource settings.
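The contrast the abstract draws can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `procrustes` and `generalized_procrustes`, the alternating update scheme, and the stopping rule are assumptions made for illustration. The sketch also assumes each embedding matrix has one row per dictionary entry, with rows already paired across languages by a seed dictionary. The first function computes the standard direct alignment of one space onto another. The second maps every language, including any supporting ones, onto a shared latent space taken to be the mean of the aligned spaces.

```python
import numpy as np


def procrustes(X, Y):
    """Orthogonal map W minimising ||X @ W - Y||_F (closed form via SVD)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def generalized_procrustes(spaces, n_iter=50, tol=1e-10):
    """Align several row-paired embedding matrices to a shared latent space G.

    Alternates between (1) solving an orthogonal Procrustes problem for each
    language against the current G and (2) resetting G to the mean of the
    mapped spaces. Hypothetical sketch; hyperparameters are illustrative.
    """
    G = spaces[0].copy()  # initialise the latent space with the first language
    maps = []
    prev_loss = np.inf
    for _ in range(n_iter):
        # Step 1: best orthogonal map from each language into G.
        maps = [procrustes(X, G) for X in spaces]
        # Step 2: latent space becomes the average of the mapped languages.
        G = np.mean([X @ W for X, W in zip(spaces, maps)], axis=0)
        loss = sum(np.linalg.norm(X @ W - G) ** 2 for X, W in zip(spaces, maps))
        if prev_loss - loss < tol:  # stop once the objective stops improving
            break
        prev_loss = loss
    return maps, G
```

With two languages, a source word is translated by mapping both spaces into `G` and retrieving nearest neighbours there. Adding a third, supporting language only means passing one more matrix in `spaces`.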

Code Implementations: 1 repo