Efficiency and Effectiveness of LLM-Based Summarization of Evidence in Crowdsourced Fact-Checking
This work addresses the challenge of scaling fact-checking to combat misinformation, though the contribution is incremental: it applies existing LLM methods to a specific domain.
The study tackles the problem of evaluating the truthfulness of online content by comparing crowdsourced fact-checking under two evidence conditions: full-length webpages and LLM-generated summaries. It finds that summaries offer comparable accuracy while significantly improving efficiency: workers complete more assessments, and task duration and costs decrease.
Evaluating the truthfulness of online content is critical for combating misinformation. This study examines the efficiency and effectiveness of crowdsourced truthfulness assessments through a comparative analysis of two approaches: one that presents full-length webpages as evidence for each claim (the Standard modality), and another that presents a summary of each evidence document generated with a large language model (the Summary modality). Using an A/B testing setting, we engage a diverse pool of participants tasked with evaluating the truthfulness of statements under these conditions. Our analysis explores both the quality of assessments and the behavioral patterns of participants. The results reveal that relying on summarized evidence offers accuracy and error metrics comparable to those of the Standard modality while significantly improving efficiency. Workers in the Summary setting complete a significantly higher number of assessments, reducing task duration and costs. Additionally, the Summary modality maximizes internal agreement and maintains consistent reliance on and perceived usefulness of evidence, demonstrating its potential to streamline large-scale truthfulness evaluations.
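The comparison described above can be illustrated with a minimal sketch of how per-modality accuracy, error, and time per assessment might be tabulated. Everything here is an assumption for illustration: the `Assessment` record, the `summarize` helper, the integer truthfulness scale, and the toy values are hypothetical and are not taken from the study's data or code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Assessment:
    modality: str       # "Standard" (full webpage) or "Summary" (LLM summary)
    judgment: int       # worker's truthfulness label on a hypothetical ordinal scale
    ground_truth: int   # expert label on the same scale
    seconds: float      # time the worker spent on this assessment


def summarize(assessments):
    """Group assessments by modality and compute simple quality/efficiency metrics."""
    by_modality = {}
    for a in assessments:
        by_modality.setdefault(a.modality, []).append(a)

    report = {}
    for modality, rows in by_modality.items():
        report[modality] = {
            "n": len(rows),
            # Share of judgments that exactly match the ground truth.
            "accuracy": mean(1.0 if r.judgment == r.ground_truth else 0.0 for r in rows),
            # Mean absolute error on the ordinal scale.
            "mae": mean(abs(r.judgment - r.ground_truth) for r in rows),
            "mean_seconds": mean(r.seconds for r in rows),
        }
    return report


# Made-up placeholder records, only to exercise the function.
toy = [
    Assessment("Standard", 2, 2, 120.0),
    Assessment("Standard", 0, 1, 150.0),
    Assessment("Summary", 2, 2, 60.0),
    Assessment("Summary", 1, 1, 40.0),
    Assessment("Summary", 0, 2, 50.0),
]

report = summarize(toy)
for modality, metrics in report.items():
    print(modality, metrics)
```

A real analysis would also need a significance test for the between-group differences and an inter-annotator agreement measure for internal agreement; this sketch covers only the descriptive tabulation.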