LG CL ML Dec 3, 2019

A Robust Self-Learning Method for Fully Unsupervised Cross-Lingual Mappings of Word Embeddings: Making the Method Robustly Reproducible as Well

arXiv:1912.01706v2 1000 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses reproducibility and robustness in cross-lingual NLP research. Its contribution is incremental: it reproduces and stress-tests an existing method rather than proposing a new one.

The paper reproduces a robust self-learning method for unsupervised cross-lingual word embedding mappings and confirms that reproduction is feasible under minor assumptions. It also tests robustness on four new languages that are less similar to English and runs a hyperparameter grid search to assess stability.

In this paper, we reproduce the experiments of Artetxe et al. (2018b) regarding the robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. We show that the reproduction of their method is indeed feasible with some minor assumptions. We further investigate the robustness of their model by introducing four new languages that are less similar to English than the ones proposed by the original paper. In order to assess the stability of their model, we also conduct a grid search over sensible hyperparameters. We then propose key recommendations applicable to any research project in order to deliver fully reproducible research.

Code Implementations: 1 repo