Language classification from bilingual word embedding graphs
This work addresses the role of the second language in bilingual word embeddings for monolingual semantic tasks, but it appears incremental: it builds on existing embedding methods without introducing major new techniques.
The study investigated how the second language in bilingual word embeddings affects monolingual semantic tasks and found positive correlations between task performance and the similarity of the second language to the target language. It also demonstrated that bilingual word embeddings can be used for semantic language classification, with results supporting the hypothesis that semantic language similarity is influenced by both structural similarity and geography/contact.
We study the role of the second language in bilingual word embeddings in monolingual semantic evaluation tasks. We find positive correlations, ranging from weak to strong, between downstream task performance and the similarity of the second language to the target language. Additionally, we show how bilingual word embeddings can be employed for the task of semantic language classification and that joint semantic spaces vary in meaningful ways across second languages. Our results support the hypothesis that semantic language similarity is influenced by both structural similarity and geography/contact.
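
One way to make the idea of comparing joint semantic spaces across second languages concrete is sketched below. This is only an illustration under stated assumptions, not the procedure used in this work: it assumes each bilingual space provides vectors for the same target-language words, builds a cosine-similarity graph over those words, and measures how far apart two graphs are by the mean absolute difference of their edge weights. The labels `L2_a`, `L2_b`, `L2_c`, the toy vectors, and the helper names `similarity_graph` and `graph_distance` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_graph(embeddings, words):
    """Weighted adjacency matrix over `words`; edge weight = cosine similarity."""
    return [[cosine(embeddings[w1], embeddings[w2]) for w2 in words] for w1 in words]

def graph_distance(g1, g2):
    """Mean absolute difference of edge weights (upper triangle only)."""
    n = len(g1)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(abs(g1[i][j] - g2[i][j]) for i, j in pairs) / len(pairs)

# Hypothetical toy data: the same three target-language words, embedded in
# joint spaces trained with three different (unnamed) second languages.
words = ["cat", "dog", "car"]
spaces = {
    "L2_a": {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]},
    "L2_b": {"cat": [1.0, 0.0], "dog": [0.8, 0.2], "car": [0.1, 1.0]},
    "L2_c": {"cat": [1.0, 0.0], "dog": [0.2, 0.9], "car": [0.9, 0.3]},
}

# One graph per second language, all over the same target-language words.
graphs = {name: similarity_graph(emb, words) for name, emb in spaces.items()}

# Pairwise distances between graphs; these could feed a clustering step.
distances = {
    (a, b): graph_distance(graphs[a], graphs[b])
    for a in graphs for b in graphs if a < b
}

# The second language whose graph is closest to that of L2_a.
nearest_to_a = min(
    (name for name in graphs if name != "L2_a"),
    key=lambda name: graph_distance(graphs["L2_a"], graphs[name]),
)
```

A pairwise distance table of this kind could then be passed to any standard clustering routine to group second languages. The choice of graph distance here is an arbitrary simplification made for the sketch.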