CL, CY | Dec 19, 2024

Face the Facts! Evaluating RAG-based Pipelines for Professional Fact-Checking

Daniel Russo, Stefano Menini, Jacopo Staiano, Marco Guerini

arXiv:2412.15189v3 | 4.85 citations | h-index: 26 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for efficient fact-checking tools for professionals, but it is incremental as it benchmarks existing RAG methods under new constraints.

The paper evaluates RAG-based pipelines for automated fact-checking on complex claims and heterogeneous knowledge bases. It finds that LLM-based retrievers outperform other retrieval techniques but struggle with heterogeneity, and that larger models excel in verdict faithfulness while smaller ones provide better context adherence.

Natural Language Processing and Generation systems have recently shown the potential to complement and streamline the costly and time-consuming job of professional fact-checkers. In this work, we lift several constraints of current state-of-the-art pipelines for automated fact-checking based on the Retrieval-Augmented Generation (RAG) paradigm. Our goal is to benchmark, following professional fact-checking practices, RAG-based methods for the generation of verdicts - i.e., short texts discussing the veracity of a claim - evaluating them on stylistically complex claims and heterogeneous, yet reliable, knowledge bases. Our findings show a complex landscape, where, for example, LLM-based retrievers outperform other retrieval techniques, though they still struggle with heterogeneous knowledge bases; larger models excel in verdict faithfulness, while smaller models provide better context adherence, with human evaluations favouring zero-shot and one-shot approaches for informativeness, and fine-tuned models for emotional alignment.
