Toward Degree Bias in Embedding-Based Knowledge Graph Completion
This paper addresses degree bias in knowledge graph completion for AI applications. It represents an incremental improvement, offering a novel method for a known bottleneck.
The paper tackles degree bias in embedding-based knowledge graph completion (KGC). It validates that the bias exists, identifies the key factor behind it, and introduces KG-Mixup, a data augmentation method that improves various embedding-based KGC methods and outperforms other bias-mitigation methods on multiple benchmark datasets.
A fundamental task for knowledge graphs (KGs) is knowledge graph completion (KGC), which aims to predict unseen edges by learning representations for all the entities and relations in a KG. A common concern when learning representations on traditional graphs is degree bias: graph algorithms tend to learn poor representations for lower-degree nodes, which often leads to low performance on those nodes. However, there has been limited research on whether degree bias exists in embedding-based KGC and how it affects KGC performance. In this paper, we validate the existence of degree bias in embedding-based KGC and identify the key factor behind it. We then introduce a novel data augmentation method, KG-Mixup, which generates synthetic triples to mitigate this bias. Extensive experiments demonstrate that our method improves various embedding-based KGC methods and outperforms other methods that tackle the bias problem on multiple benchmark datasets.
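To make the notion of degree bias concrete, the following minimal sketch shows one way to check whether link-prediction quality varies with entity degree. It is an illustration only: the function names, the bucket edges, the use of Hits@10, and the choice to count both head and tail occurrences toward an entity's degree are assumptions made here, not details taken from the paper, and the sketch does not implement KG-Mixup.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def entity_degrees(triples):
    """Count how often each entity occurs as head or tail in (h, r, t) triples."""
    degrees = Counter()
    for head, _relation, tail in triples:
        degrees[head] += 1
        degrees[tail] += 1
    return degrees


def hits_at_k_by_degree(test_ranks, degrees, edges=(0, 5, 20), k=10):
    """Average Hits@k per degree bucket.

    test_ranks: iterable of (entity, rank) pairs, where rank is the position
                (1 = best) that a model assigned to the correct entity.
    edges:      sorted lower bounds of the buckets (hypothetical defaults);
                the first edge must be 0 so every degree falls in a bucket.
    """
    hits = defaultdict(list)
    for entity, rank in test_ranks:
        degree = degrees.get(entity, 0)  # entities unseen in training have degree 0
        bucket = edges[bisect_right(edges, degree) - 1]
        hits[bucket].append(1.0 if rank <= k else 0.0)
    return {bucket: sum(v) / len(v) for bucket, v in hits.items()}


# Toy data, invented purely for illustration.
train = [("a", "r", "b"), ("a", "r", "c"), ("a", "r", "d"), ("b", "r", "c")]
ranks = [("a", 1), ("b", 15), ("c", 3), ("d", 50)]

degrees = entity_degrees(train)
per_bucket = hits_at_k_by_degree(ranks, degrees, edges=(0, 2, 3))
print(per_bucket)
```

If the scores in the low-degree buckets are consistently worse than those in the high-degree buckets on real data, that pattern is the kind of degree bias the abstract describes.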