KGEx: Explaining Knowledge Graph Embeddings via Subgraph Sampling and Knowledge Distillation
This work addresses the interpretability gap faced by users of knowledge graph embeddings, though the contribution is incremental because it builds on existing surrogate model techniques.
The paper tackles the problem of interpreting link predictions made by knowledge graph embedding models. It introduces KGEx, a post-hoc method that identifies important training triples through surrogate models and knowledge distillation, and reports faithful explanations on two public datasets.
Although knowledge graph embeddings (KGE) are the go-to choice for link prediction on knowledge graphs, their interpretability remains relatively unexplored. We present KGEx, a novel post-hoc method that explains individual link predictions by drawing inspiration from surrogate models research. Given a target triple to predict, KGEx trains surrogate KGE models that we use to identify important training triples. To gauge the impact of a training triple, we sample random portions of the target triple's neighborhood and train multiple surrogate KGE models on each of them. To ensure faithfulness, each surrogate is trained by distilling knowledge from the original KGE model. We then assess how well the surrogates predict the target triple being explained, the intuition being that those leading to faithful predictions have been trained on impactful neighborhood samples. Under this assumption, we then harvest triples that appear frequently across impactful neighborhoods. We conduct extensive experiments on two publicly available datasets to demonstrate that KGEx provides explanations faithful to the black-box model.
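The sample, score, and harvest loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `explain`, its parameters, and the toy data are all hypothetical. In particular, `fidelity_fn` is an assumed stand-in for the expensive step of training a surrogate KGE model on a neighborhood sample by distillation and measuring how closely its prediction for the target triple matches the black-box model. The sketch also scores one surrogate per sample for simplicity.

```python
import random
from collections import Counter


def explain(target, neighborhood, fidelity_fn, n_samples=200,
            sample_frac=0.5, top_samples=5, top_triples=3, seed=0):
    """Return the triples that occur most often in high-fidelity samples.

    fidelity_fn(sample, target) is a hypothetical callable standing in for
    "train a distilled surrogate on `sample`, then score how faithfully it
    predicts `target`". Higher values mean a more faithful surrogate.
    """
    rng = random.Random(seed)
    size = max(1, int(len(neighborhood) * sample_frac))

    # 1. Sample random portions of the target triple's neighborhood
    #    and score the (stand-in) surrogate for each sample.
    scored = []
    for _ in range(n_samples):
        sample = rng.sample(neighborhood, size)
        scored.append((fidelity_fn(sample, target), sample))

    # 2. Keep the samples whose surrogates were most faithful.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    impactful = [sample for _, sample in scored[:top_samples]]

    # 3. Harvest triples that appear frequently across impactful samples.
    counts = Counter()
    for sample in impactful:
        counts.update(sample)
    return [triple for triple, _ in counts.most_common(top_triples)]


# Toy data (invented for illustration only).
target = ("alice", "works_at", "acme")
key_triple = ("alice", "graduated_from", "acme_academy")
neighborhood = [
    key_triple,
    ("alice", "lives_in", "paris"),
    ("alice", "friend_of", "bob"),
    ("bob", "lives_in", "rome"),
    ("acme", "located_in", "berlin"),
    ("acme", "founded_by", "carol"),
]


def toy_fidelity(sample, target):
    # Assumption for the demo: a surrogate is faithful only if its
    # training sample contains `key_triple`.
    return 1.0 if key_triple in sample else 0.0


explanation = explain(target, neighborhood, toy_fidelity)
print(explanation)
```

In a real setting, `fidelity_fn` would dominate the cost, since each call implies training a surrogate model. The number of samples and the sample size therefore trade explanation quality against compute.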