Semi-automatic WordNet Linking using Word Embeddings
This work addresses the costly and tedious manual maintenance of linked wordnets for NLP applications, offering an incremental improvement to aid human experts.
The paper tackles the problem of manually linking wordnets across languages by proposing a semi-automatic approach that uses word embeddings to rank candidate synsets, achieving retrieval of the correct synset in the top 10 for 60% of all synsets and 70% of noun synsets.
Wordnets are rich lexico-semantic resources. Linked wordnets extend them by linking similar concepts across the wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those that take knowledge-based approaches, where they are treated as a gold standard or oracle. It is therefore crucial that these resources hold correct information, which is why they are created by human experts. However, manual maintenance of such resources is tedious and costly, so techniques that can aid the experts are desirable. In this paper, we propose an approach to link wordnets. Given a synset of the source language, the approach returns a ranked list of candidate synsets in the target language, from which the human expert can choose the correct one(s). Our technique retrieves a correct synset within the top 10 ranked candidates for 60% of all synsets and 70% of noun synsets.
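The ranking step and the top-10 figure can be illustrated with a minimal sketch. It is not the method of the paper, whose details are not given here. It assumes that each synset is represented by the mean of its word embeddings, that source and target vectors already live in a shared cross-lingual space, and that candidates are scored by cosine similarity. All function names, synset ids, and vectors below are hypothetical.

```python
import math

def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors (assumed synset representation)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_candidates(source_vec, target_synsets, k=10):
    """Return the ids of the k target synsets most similar to source_vec."""
    scored = sorted(
        target_synsets.items(),
        key=lambda item: cosine(source_vec, item[1]),
        reverse=True,
    )
    return [synset_id for synset_id, _ in scored[:k]]

def hit_rate_at_k(ranked_lists, gold, k=10):
    """Fraction of source synsets whose gold link appears in their top-k list."""
    hits = sum(1 for src, ranked in ranked_lists.items() if gold[src] in ranked[:k])
    return hits / len(ranked_lists)

# Toy, made-up 2-d vectors standing in for real embeddings (hypothetical data).
source_vec = mean_vector([[1.0, 0.0], [0.8, 0.2]])
target_synsets = {
    "tgt_dog": [0.9, 0.1],
    "tgt_cat": [0.5, 0.5],
    "tgt_car": [0.0, 1.0],
}
top = rank_candidates(source_vec, target_synsets, k=2)
```

Under these assumptions, the expert would be shown the list `top` and pick the correct link from it, and `hit_rate_at_k` with `k=10` corresponds to the kind of measure reported as 60% for all synsets and 70% for noun synsets.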