CL, Dec 5, 2017

Neural Cross-Lingual Entity Linking

Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza

arXiv:1712.01813v1 | 11.9 | 105 citations | h-index: 29

Originality: Highly original

AI Analysis

This work addresses the problem of linking mentions in non-English documents to the English Wikipedia for improved information access. It represents an incremental advance with strong gains on specific datasets.

The paper tackles the challenge of cross-lingual entity linking by proposing a neural model that uses fine-grained similarities and multi-lingual embeddings, achieving state-of-the-art results on English, Spanish, and Chinese TAC 2015 datasets.

A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions that might refer to different Wikipedia entities in different contexts. The problem is exacerbated in cross-lingual EL, which involves linking mentions written in non-English documents to entries in the English Wikipedia: to compare textual clues across languages, we need to compute similarity between textual fragments across languages. In this paper, we propose a neural EL model that trains fine-grained similarities and dissimilarities between the query and candidate document from multiple perspectives, combined with convolution and tensor networks. Further, we show that this English-trained system can be applied, in zero-shot learning, to other languages by making surprisingly effective use of multi-lingual embeddings. The proposed system has strong empirical evidence, yielding state-of-the-art results on English as well as cross-lingual (Spanish and Chinese) TAC 2015 datasets.
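To make the cross-lingual comparison concrete, the following is a minimal sketch of ranking English candidate entities for a Spanish mention context using a shared multi-lingual embedding space. It is not the model proposed in the paper: the convolution and tensor networks and the fine-grained multi-perspective similarities are replaced here by plain averaged word vectors and cosine similarity. The embedding values, the candidate titles, and the helper names (`TOY_EMBEDDINGS`, `embed`, `cosine`, `rank_candidates`) are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy embedding space: words with similar meaning in different
# languages are assumed to lie close together. Real systems would load
# pretrained multi-lingual embeddings instead of these made-up values.
TOY_EMBEDDINGS = {
    "president": [0.90, 0.10, 0.00],
    "presidente": [0.88, 0.12, 0.02],
    "election": [0.80, 0.20, 0.10],
    "elecciones": [0.79, 0.22, 0.08],
    "river": [0.00, 0.90, 0.20],
    "water": [0.10, 0.80, 0.30],
}
DIM = 3


def embed(tokens):
    """Average the vectors of all known tokens; zero vector if none are known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (norm_u * norm_v)


def rank_candidates(mention_context, candidates):
    """Score each candidate's description against the mention context.

    Returns (title, score) pairs sorted from most to least similar.
    """
    query = embed(mention_context)
    scored = [
        (title, cosine(query, embed(tokens)))
        for title, tokens in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Spanish context words around a mention, scored against English descriptions
# of two hypothetical candidate entities.
spanish_context = ["presidente", "elecciones"]
english_candidates = {
    "Politician_entity": ["president", "election"],
    "River_entity": ["river", "water"],
}
ranked = rank_candidates(spanish_context, english_candidates)
for title, score in ranked:
    print(f"{title}: {score:.3f}")
```

Because the scoring function only sees vectors, nothing in it is specific to Spanish. Under the assumption that the embeddings are aligned across languages, the same function can be applied to a context in any covered language, which is the intuition behind the zero-shot transfer described in the abstract.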
