Cross-lingual Entity Alignment via Joint Attribute-Preserving Embedding
This work addresses the challenge of aligning entities across languages for knowledge base integration. The contribution is incremental: it extends existing embedding techniques by incorporating attributes.
The paper tackles the problem of cross-lingual entity alignment in knowledge bases by proposing a joint attribute-preserving embedding model that leverages attribute correlations, and it significantly outperforms state-of-the-art embedding approaches on real-world datasets.
Entity alignment is the task of finding entities in two knowledge bases (KBs) that represent the same real-world object. When facing KBs in different natural languages, conventional cross-lingual entity alignment methods rely on machine translation to eliminate the language barriers. These approaches often suffer from the uneven quality of translations between languages. While recent embedding-based techniques encode entities and relationships in KBs and do not need machine translation for cross-lingual entity alignment, a significant number of attributes remain largely unexplored. In this paper, we propose a joint attribute-preserving embedding model for cross-lingual entity alignment. It jointly embeds the structures of two KBs into a unified vector space and further refines it by leveraging attribute correlations in the KBs. Our experimental results on real-world datasets show that this approach significantly outperforms the state-of-the-art embedding approaches for cross-lingual entity alignment and could be complemented with methods based on machine translation.
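To make the idea of aligning entities in a unified vector space concrete, here is a minimal sketch. It is not the paper's model. The abstract does not state the scoring function, so the translation-style score (head + relation ≈ tail) and the cosine nearest-neighbour matching are assumptions, and the attribute-correlation refinement is omitted. All names and the toy embeddings (`translation_score`, `cosine`, `align`, `en`, `fr`) are hypothetical.

```python
import math


def translation_score(head, relation, tail):
    """Squared L2 distance between head + relation and tail.

    Assumed translation-style structure score: lower means the triple
    (head, relation, tail) is more plausible under the embedding.
    """
    return sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def align(source, target):
    """Map each source entity to its most similar target entity.

    Both arguments are dicts of entity name -> embedding, assumed to
    live in the same vector space already.
    """
    return {
        s_name: max(target, key=lambda t_name: cosine(s_vec, target[t_name]))
        for s_name, s_vec in source.items()
    }


# Hypothetical toy embeddings for two KBs in a shared 2-D space.
en = {"Paris": [0.9, 0.1], "Berlin": [0.1, 0.9]}
fr = {"Paris_fr": [0.8, 0.2], "Berlin_fr": [0.2, 0.8]}

alignment = align(en, fr)
```

Once both KBs share one space, alignment reduces to a similarity search, which is why no machine translation is needed at matching time. A real system would learn the embeddings from KB triples and seed alignments rather than hard-coding them.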