Knowledge Base Completion: Baseline strikes back (Again)
This work addresses computational efficiency and performance evaluation in knowledge base completion. It shows that many recent methods become indistinguishable in performance when trained with all available negative samples, which suggests that their individual contributions may be incremental.
The paper tackles Knowledge Base Completion by showing that training with all available negative samples, rather than a small subset, yields near state-of-the-art performance on standard benchmark datasets such as FB15k and WN18RR. Complex trained in this way, which the authors call COMPLEX-V2, achieves competitive results.
Knowledge Base Completion (KBC) has been a very active area lately. Several recent KBC papers propose architectural changes, new training methods, or even new formulations. KBC systems are usually evaluated on standard benchmark datasets: FB15k, FB15k-237, WN18, WN18RR, and Yago3-10. Most existing methods train with a small number of negative samples for each positive instance in these datasets to save computational costs. This paper discusses how recent developments allow us to use all available negative samples for training. We show that Complex, when trained using all available negative samples, gives near state-of-the-art performance on all the datasets. We call this approach COMPLEX-V2. We also highlight how various multiplicative KBC methods recently proposed in the literature benefit from this training regime and become indistinguishable in terms of performance on most datasets. Our work calls for a reassessment of their individual value in light of these findings.
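
To make the idea of training with all available negative samples concrete, here is a minimal NumPy sketch. It assumes a Complex-style trilinear score and a softmax cross-entropy loss over every entity as a candidate object; the abstract does not specify the loss, so this is one common way to realize such a regime and not necessarily the authors' exact setup. The function names, embedding sizes, and example triples are hypothetical.

```python
import numpy as np

def complex_scores_all_objects(E, R, s, r):
    """Score every entity as a candidate object for each (subject, relation) pair.

    E: (num_entities, dim) complex entity embeddings
    R: (num_relations, dim) complex relation embeddings
    s, r: integer arrays of shape (batch,)
    Returns a real array of shape (batch, num_entities).
    """
    query = E[s] * R[r]                       # element-wise product, (batch, dim)
    # Re(<e_s, w_r, conj(e_o)>) for all o at once via a single matrix product.
    return np.real(query @ np.conj(E).T)

def full_negative_loss(E, R, s, r, o):
    """Softmax cross-entropy where all other entities act as negatives (assumed loss)."""
    scores = complex_scores_all_objects(E, R, s, r)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(o)), o].mean()

# Toy setup with made-up sizes and triples, for illustration only.
rng = np.random.default_rng(0)
num_entities, num_relations, dim = 50, 4, 8
E = rng.normal(size=(num_entities, dim)) + 1j * rng.normal(size=(num_entities, dim))
R = rng.normal(size=(num_relations, dim)) + 1j * rng.normal(size=(num_relations, dim))

s = np.array([0, 3, 7])    # subject ids
r = np.array([1, 0, 2])    # relation ids
o = np.array([5, 9, 11])   # gold object ids

scores = complex_scores_all_objects(E, R, s, r)
loss = full_negative_loss(E, R, s, r, o)
```

The single matrix product against the full entity table is what makes scoring every negative affordable in this sketch: the cost per positive is one (dim x num_entities) multiplication instead of a loop over sampled negatives.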