CL AI LG

May 24, 2023

SciFix: Outperforming GPT3 on Scientific Factual Error Correction

Dhananjay Ashok, Atharva Kulkarni, Hai Pham, Barnabás Póczos

arXiv:2305.14707v2

0.91 citations

Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of correcting factual errors in scientific claims without relying on costly verification models, and it reports a large accuracy improvement over prior methods in this domain.

The paper tackles the problem of factual error correction in scientific claims by introducing SciFix, a system that outperforms existing methods and GPT3.5, achieving correction accuracies of 84%, 77%, and 72% on SciFact, SciFact-Open, and CovidFact datasets, respectively.

Due to the prohibitively high cost of creating error correction datasets, most Factual Claim Correction methods rely on a powerful verification model to guide the correction process. This leads to a significant drop in performance in domains like scientific claims, where good verification models do not always exist. In this work, we introduce SciFix, a scientific claim correction system that does not require a verifier but can outperform existing methods by a considerable margin -- achieving correction accuracy of 84% on the SciFact dataset, 77% on SciFact-Open and 72% on the CovidFact dataset, compared to next best accuracies of 7%, 5%, and 15% on the same datasets respectively. Our method leverages the power of prompting with LLMs during training to create a richly annotated dataset that can be used for fully supervised training and regularization. We additionally use a claim-aware decoding procedure to improve the quality of corrected claims. Our method outperforms the very LLM that was used to generate the annotated dataset -- with Few-Shot Prompting on GPT3.5 achieving 58%, 61%, and 64% on the respective datasets, a consistently lower correction accuracy, despite using nearly 800 times as many parameters as our model.
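The accuracy figures quoted in the abstract are easier to compare side by side. The following is a minimal sketch, not code from the paper: the dictionary keys and the `margins` helper are hypothetical names chosen for illustration, and the only data used are the percentages stated in the abstract above.

```python
# Correction accuracies (%) exactly as quoted in the abstract.
# Key names are illustrative assumptions, not identifiers from the paper's code.
reported = {
    "SciFact":      {"scifix": 84, "gpt35_few_shot": 58, "next_best": 7},
    "SciFact-Open": {"scifix": 77, "gpt35_few_shot": 61, "next_best": 5},
    "CovidFact":    {"scifix": 72, "gpt35_few_shot": 64, "next_best": 15},
}


def margins(table):
    """Return the percentage-point gap between SciFix and each baseline, per dataset."""
    return {
        name: {
            "vs_gpt35": row["scifix"] - row["gpt35_few_shot"],
            "vs_next_best": row["scifix"] - row["next_best"],
        }
        for name, row in table.items()
    }


gaps = margins(reported)
for name, gap in gaps.items():
    print(
        f"{name}: +{gap['vs_gpt35']} points over few-shot GPT3.5, "
        f"+{gap['vs_next_best']} points over the next best method"
    )
```

On these numbers, the gap over few-shot GPT3.5 is widest on SciFact and narrowest on CovidFact, while the gap over the next best method exceeds 50 percentage points on all three datasets.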
