Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases
This work challenges the assumption that pre-trained language models can be used as factual knowledge bases, a question that matters to researchers and practitioners in NLP and knowledge representation.
The paper investigates how masked language models (MLMs) like BERT extract factual knowledge, finding that their previously reported competitive performance largely stems from biased prompts overfitting dataset artifacts rather than genuine knowledge, and strongly questions whether current MLMs can serve as reliable knowledge bases.
Previous work shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source. In this paper, we conduct a rigorous study to explore the underlying prediction mechanisms of MLMs over different extraction paradigms. By investigating the behaviors of MLMs, we find that the previously reported decent performance is mainly due to biased prompts that overfit dataset artifacts. Furthermore, incorporating illustrative cases and external contexts improves knowledge prediction mainly through entity type guidance and golden answer leakage. Our findings shed light on the underlying prediction mechanisms of MLMs, and strongly question the previous conclusion that current MLMs can potentially serve as reliable factual knowledge bases.
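
As a rough illustration of the prompt-bias finding, the sketch below compares two numbers on a toy set of gold answers: the precision obtained by always emitting one fixed answer (standing in for what a prompt might yield without any subject), and the precision of a majority-class guess. The function names, the toy answers, and the idea of a single subject-free prediction are hypothetical and are not taken from the paper; no model is called.

```python
from collections import Counter


def majority_baseline_precision(gold_answers):
    """Precision@1 of always predicting the most frequent gold answer."""
    if not gold_answers:
        raise ValueError("gold_answers must not be empty")
    counts = Counter(gold_answers)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(gold_answers)


def prompt_only_precision(prompt_only_prediction, gold_answers):
    """Precision@1 if the subject were ignored and one fixed answer were always output."""
    if not gold_answers:
        raise ValueError("gold_answers must not be empty")
    hits = sum(1 for answer in gold_answers if answer == prompt_only_prediction)
    return hits / len(gold_answers)


# Hypothetical toy data: gold objects for five facts of one relation.
gold = ["London", "London", "Paris", "London", "Berlin"]

# Hypothetical stand-in for the top prediction of a prompt with no subject.
prompt_only_guess = "London"

print(majority_baseline_precision(gold))
print(prompt_only_precision(prompt_only_guess, gold))
```

Under these assumptions, a subject-free score that approaches the score obtained with real subjects would suggest that the prompt and the answer distribution, rather than stored knowledge, account for much of the measured performance.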