Towards Better Evolution Modeling for Temporal Knowledge Graphs
This work addresses flawed evaluation in TKG research, a problem relevant to the AI community, but its contribution is incremental: it improves benchmarks rather than proposing a new model.
The authors identified that existing benchmarks for temporal knowledge graph (TKG) evolution modeling contain shortcuts, such as biases in datasets and simplified evaluation tasks, allowing near state-of-the-art performance without using temporal information. They introduced a new benchmark with bias-corrected datasets and novel tasks to address these issues and promote more accurate modeling.
Temporal knowledge graphs (TKGs) structurally preserve evolving human knowledge. Recent research has focused on designing models that learn the evolutionary nature of TKGs to predict future facts, achieving impressive results: for instance, Hits@10 scores above 0.9 on the YAGO dataset. However, we find that existing benchmarks inadvertently introduce a shortcut: near state-of-the-art performance can be achieved simply by counting co-occurrences, without using any temporal information. In this work, we examine the root cause of this issue, identifying inherent biases in current datasets and an oversimplified evaluation task that is vulnerable to these biases. Through this analysis, we further uncover additional limitations of existing benchmarks, including unreasonable formatting of time-interval knowledge, neglect of learning knowledge obsolescence, and insufficient information for precise evolution understanding, all of which can amplify the shortcut and hinder a fair assessment. Therefore, we introduce the TKG evolution benchmark. It includes four bias-corrected datasets and two novel tasks closely aligned with the evolution process, promoting a more accurate understanding of the challenges in TKG evolution modeling. The benchmark is available at: https://github.com/zjs123/TKG-Benchmark.
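To make the co-occurrence shortcut concrete, the following is a minimal sketch of a time-agnostic counting baseline scored with Hits@k. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify which co-occurrences are counted, so this version assumes counts of objects per (subject, relation) pair, and the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_cooccurrence(train_quads):
    """Count objects seen with each (subject, relation) pair; timestamps are ignored."""
    counts = defaultdict(Counter)
    for s, r, o, _t in train_quads:
        counts[(s, r)][o] += 1
    return counts


def rank_of(counts, s, r, true_o):
    """Return the 1-based rank of true_o by descending count, or None if never seen."""
    seen = counts.get((s, r), Counter())
    if true_o not in seen:
        return None
    # Sort by count (descending), then by name so ties break deterministically.
    ranked = sorted(seen, key=lambda o: (-seen[o], o))
    return ranked.index(true_o) + 1


def hits_at_k(counts, test_quads, k):
    """Fraction of test queries whose true object ranks within the top k."""
    hits = 0
    for s, r, o, _t in test_quads:
        rank = rank_of(counts, s, r, o)
        if rank is not None and rank <= k:
            hits += 1
    return hits / len(test_quads)


# Toy quadruples (subject, relation, object, timestamp); all names are made up.
train = [
    ("A", "lives_in", "X", 0),
    ("A", "lives_in", "X", 1),
    ("A", "lives_in", "Y", 2),
    ("B", "works_at", "Z", 0),
]
test = [
    ("A", "lives_in", "X", 3),  # repeated fact: the counter ranks it first
    ("B", "works_at", "W", 3),  # unseen object: the counter cannot rank it
]

counts = fit_cooccurrence(train)
print(hits_at_k(counts, test, k=1))
```

The baseline never reads the timestamp field, so any benchmark on which it scores close to a temporal model is one where repeated facts dominate the test set, which is the kind of dataset bias the abstract describes.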