CL · May 4, 2022

Cross-lingual Word Embeddings in Hyperbolic Space

arXiv:2205.01907v2 · 4 citations · h-index: 52
Originality: Incremental advance
AI Analysis

This is an incremental improvement for natural language processing applications that require cross-lingual embeddings: it improves how hierarchical structure is represented across languages.

The paper tackles the problem of learning cross-lingual word embeddings by adapting Word2Vec to hyperbolic space using the Poincaré ball model. It finds that the resulting embeddings capture hierarchical relationships better than Euclidean-based methods while achieving comparable performance on analogy tasks.

Cross-lingual word embeddings can be applied to several natural language processing applications across multiple languages. Unlike prior works that use word embeddings based on the Euclidean space, this short paper presents a simple and effective cross-lingual Word2Vec model that adapts to the Poincaré ball model of hyperbolic space to learn unsupervised cross-lingual word representations from a German-English parallel corpus. It has been shown that hyperbolic embeddings can capture and preserve hierarchical relationships. We evaluate the model on both hypernymy and analogy tasks. The proposed model achieves performance comparable to the vanilla Word2Vec model on the cross-lingual analogy task. The hypernymy task shows that the cross-lingual Poincaré Word2Vec model can capture latent hierarchical structure from free text across languages, which is absent from the Euclidean-based Word2Vec representations. Our results show that by preserving the latent hierarchical information, hyperbolic spaces can offer better representations for cross-lingual embeddings.
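
For readers unfamiliar with the Poincaré ball model mentioned in the abstract, the following is a minimal sketch of its standard geodesic distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). It is not the paper's training code: the function name `poincare_distance` and the toy 2-D vectors are hypothetical, chosen only to illustrate why points near the boundary of the ball sit much farther apart than their Euclidean distance suggests, which is the property that lets hyperbolic embeddings encode hierarchy.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Standard Poincaré ball metric; the denominator shrinks near the boundary,
    # so distances grow rapidly as points move away from the origin.
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Hypothetical toy vectors (not taken from the paper):
# a "general" word near the origin and two "specific" words near the boundary.
general = [0.0, 0.0]
specific_a = [0.9, 0.0]
specific_b = [0.0, 0.9]

d_general_to_a = poincare_distance(general, specific_a)
d_a_to_b = poincare_distance(specific_a, specific_b)
```

In this toy setup, the path between the two boundary points is much longer than the path from the origin to either one, mirroring how sibling leaves in a tree are connected through a shared ancestor.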

