COVID-Fact: Fact Extraction and Verification of Real-World Claims on COVID-19 Pandemic
This work addresses misinformation detection in the COVID-19 domain, offering a resource whose contribution is an incremental step toward automating dataset creation.
The authors tackle misinformation detection by creating COVID-Fact, a dataset of 4,086 claims about COVID-19 with evidence and counter-claims, using automatic methods to reduce annotation costs. Their experiments indicate that it provides a challenging testbed for new systems and that the approach lowers dataset construction costs.
We introduce COVID-Fact, a FEVER-like dataset of 4,086 claims concerning the COVID-19 pandemic. The dataset contains claims, evidence for the claims, and contradictory claims refuted by the evidence. Unlike previous approaches, we automatically detect true claims and their source articles and then generate counter-claims using automatic methods rather than employing human annotators. Along with our constructed resource, we formally present the task of identifying relevant evidence for the claims and verifying whether the evidence refutes or supports a given claim. In addition to scientific claims, our data contains simplified general claims from media sources, making it better suited for detecting general misinformation regarding COVID-19. Our experiments indicate that COVID-Fact will provide a challenging testbed for the development of new systems and that our approach will reduce the costs of building domain-specific datasets for detecting misinformation.
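To make the record structure and the two-step task concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the released data format or the authors' system: the names `ClaimRecord`, `make_pair`, `token_overlap`, and `retrieve_evidence` are hypothetical, the label strings are assumed, and the word-overlap ranking is a toy stand-in for whatever retrieval model a real system would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClaimRecord:
    """One entry: a claim, its evidence sentences, and a label (assumed schema)."""
    claim: str
    evidence: List[str]
    label: str  # assumed label strings: "SUPPORTED" or "REFUTED"


def make_pair(true_claim: str, counter_claim: str,
              evidence: List[str]) -> List[ClaimRecord]:
    """Pair a true claim with its counter-claim over the same evidence.

    The true claim is supported by the evidence; the counter-claim is
    refuted by that same evidence.
    """
    return [
        ClaimRecord(true_claim, list(evidence), "SUPPORTED"),
        ClaimRecord(counter_claim, list(evidence), "REFUTED"),
    ]


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercased whitespace tokens (toy similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve_evidence(claim: str, candidates: List[str], k: int = 1) -> List[str]:
    """Step 1 of the task: rank candidate sentences by similarity to the claim."""
    ranked = sorted(candidates, key=lambda s: token_overlap(claim, s), reverse=True)
    return ranked[:k]
```

Step 2, deciding whether the retrieved evidence supports or refutes the claim, would need a trained entailment-style classifier and is deliberately left out of this sketch.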