AI · Mar 29, 2021

Boosting the Speed of Entity Alignment 10×: Dual Attention Matching Network with Normalized Hard Sample Mining

Xin Mao, Wenting Wang, Yuanbin Wu, Man Lan

arXiv:2103.15452v1 · 22.6 · 152 citations · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses scalability issues in knowledge graph integration, which is crucial for applications such as data fusion, but the advance is incremental because it builds on existing methods.

The paper tackles the problem of slow and inefficient entity alignment in knowledge graphs by proposing a new encoder and loss function, achieving at least 10x faster processing on a benchmark dataset and improving Hits@1 and MRR by 6% to 13%.
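Hits@1 and MRR, the two metrics cited above, are standard ranking measures. The following is a minimal sketch of how they are commonly computed, assuming a similarity matrix in which row i holds the scores of source entity i against every candidate target and the correct match for row i is column i. The function names and the toy matrix are illustrative and do not come from the paper or its code.

```python
def rank_of_target(scores, target_index):
    """1-based rank of the correct candidate; a higher score is better."""
    target = scores[target_index]
    return 1 + sum(1 for s in scores if s > target)


def hits_at_k(ranks, k=1):
    """Fraction of queries whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy similarity matrix (hypothetical values): entity i should align to column i.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.7],
]
ranks = [rank_of_target(row, i) for i, row in enumerate(sim)]
hits1 = hits_at_k(ranks, k=1)
mrr = mean_reciprocal_rank(ranks)
```

In this toy case the second entity is ranked below a wrong candidate, so both metrics fall below 1. Hits@1 counts only exact top-ranked matches, while MRR gives partial credit to near misses.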

Seeking equivalent entities among multi-source Knowledge Graphs (KGs), known as entity alignment (EA), is the pivotal step in KG integration. However, most existing EA methods are inefficient and scale poorly. A recent summary points out that some of them require several days to process a dataset containing 200,000 nodes (DWY100K). We believe that over-complex graph encoders and inefficient negative sampling strategies are the two main reasons. In this paper, we propose a novel KG encoder, Dual Attention Matching Network (Dual-AMN), which not only models both intra-graph and cross-graph information but also greatly reduces computational complexity. Furthermore, we propose the Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. Experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method finishes in 1,100 seconds, at least 10× faster than previous work. Our method also outperforms previous work across all datasets, with Hits@1 and MRR improved by 6% to 13%.
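The abstract does not spell out the loss, so the following is only a sketch of the general idea of smoothly selecting hard negatives, not the paper's formulation. It assumes a LogSumExp aggregation over margin violations, which acts as a smooth maximum and gives the hardest negatives the largest gradient weight. The normalization step that gives the paper's loss its name is not described here and is omitted. The function names, `margin`, and `scale` are hypothetical.

```python
import math


def smooth_hard_negative_loss(pos_sim, neg_sims, margin=1.0, scale=10.0):
    """LogSumExp over margin violations: a smooth stand-in for the max.

    Assumed form (not taken from the paper): each negative contributes
    margin + sim(anchor, negative) - sim(anchor, positive).
    """
    scaled = [scale * (margin + s - pos_sim) for s in neg_sims]
    m = max(scaled)  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return lse / scale


def negative_weights(pos_sim, neg_sims, margin=1.0, scale=10.0):
    """Softmax weights, i.e. the gradient of the loss w.r.t. each violation."""
    scaled = [scale * (margin + s - pos_sim) for s in neg_sims]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example (hypothetical similarities): the second negative is the hardest.
loss = smooth_hard_negative_loss(0.8, [0.1, 0.7, 0.3])
weights = negative_weights(0.8, [0.1, 0.7, 0.3])
```

A larger `scale` pushes the weights toward picking only the single hardest negative, while a smaller one spreads the gradient more evenly across all negatives, so no explicit sampling step is needed.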
