CL AI | Feb 19, 2024

Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking over Larger Language Models

arXiv:2402.12147v38.721 citations | h-index: 4 | SIGIR

Originality: Incremental advance

AI Analysis

This work addresses scalable, accurate fact-checking of multilingual and complex claims. Its contribution is incremental because it builds on existing fine-tuning and LLM methods.

The paper tackles the challenge of building an end-to-end fact-checking pipeline covering over 90 languages. It finds that fine-tuned Transformer models outperform large language models such as GPT-4 on claim detection and veracity prediction, including in multilingual settings and on complex claims involving numerical quantities.

In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. Our real-world experimental benchmarks demonstrate that fine-tuning Transformer models specifically for fact-checking tasks, such as claim detection and veracity prediction, provides superior performance over large language models (LLMs) like GPT-4, GPT-3.5-Turbo, and Mistral-7b. However, we illustrate that LLMs excel in generative tasks such as question decomposition for evidence retrieval. Through extensive evaluation, we show the efficacy of fine-tuned models for fact-checking in a multilingual setting and on complex claims that include numerical quantities.

