Team Papelo: Transformer Networks at FEVER
This work addresses automated fact-checking, a problem of interest to researchers and practitioners in natural language processing. It is incremental in that it builds on existing transformer methods for a specific benchmark.
The paper tackles the FEVER fact extraction and verification challenge by developing a system that uses a high-precision transformer-based entailment classifier to evaluate evidence from multiple articles, achieving a FEVER score of 0.5736, label accuracy of 0.6108, and evidence F1 of 0.6485 in preliminary evaluation.
We develop a system for the FEVER fact extraction and verification challenge that uses a high-precision entailment classifier, based on transformer networks pretrained with language modeling, to classify a broad set of potential evidence. The precision of the entailment classifier allows us to enhance recall by considering every statement from several articles when deciding on each claim. We include not only the articles that best match the claim text by TF-IDF score, but also additional articles whose titles match named entities and capitalized expressions occurring in the claim text. The entailment module evaluates potential evidence one statement at a time, together with the title of the page the evidence came from (providing a hint about possible pronoun antecedents). In preliminary evaluation, the system achieves a FEVER score of 0.5736, label accuracy of 0.6108, and evidence F1 of 0.6485 on the FEVER shared task test set.
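The title-matching and input-construction steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the system's actual code: the function names, the regular expression used to find capitalized expressions, and the bracketed `[ title ]` prefix used to attach the page title to each statement. The TF-IDF retrieval and the transformer entailment classifier themselves are omitted.

```python
import re

# Hypothetical pattern: a maximal run of consecutive capitalized words.
# Note that it also matches an ordinary sentence-initial word such as "The".
CAPITALIZED_RUN = re.compile(r"[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*")


def capitalized_expressions(claim):
    """Return runs of consecutive capitalized words found in the claim."""
    return [m.group(0) for m in CAPITALIZED_RUN.finditer(claim)]


def candidate_titles(claim, titles):
    """Select page titles that exactly match a capitalized expression
    in the claim (case-insensitive). A stand-in for title matching."""
    phrases = {p.lower() for p in capitalized_expressions(claim)}
    return [t for t in titles if t.lower() in phrases]


def entailment_inputs(claim, pages):
    """Pair every statement of every selected page with the claim.

    `pages` maps a page title to its list of sentences. The page title
    is prepended to each statement so that a classifier could resolve
    pronouns such as "He" (the prefix format is an assumption).
    Returns (title, sentence_index, premise, hypothesis) tuples.
    """
    inputs = []
    for title, sentences in pages.items():
        for index, sentence in enumerate(sentences):
            premise = f"[ {title} ] {sentence}"
            inputs.append((title, index, premise, claim))
    return inputs


if __name__ == "__main__":
    claim = "Barack Obama was born in Hawaii."
    titles = ["Barack Obama", "Hawaii", "Chicago"]
    pages = {"Barack Obama": ["He was born in Honolulu, Hawaii."]}
    print(candidate_titles(claim, titles))
    for item in entailment_inputs(claim, pages):
        print(item)
```

Each resulting premise and hypothesis pair would then be scored independently by the entailment classifier. Because every statement is considered, the approach relies on the classifier being precise enough to reject the many irrelevant statements.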