CL · Nov 12, 2023

Trusted Source Alignment in Large Language Models

Vasilisa Bashlovkina, Zhaobin Kuang, Riley Matthews, Edward Clifford, Yennie Jun, William W. Cohen, Simon Baumgartner

arXiv:2311.06697v12.54 citations · h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses the problem of unreliable information in LLMs for users who need factual accuracy, but the advance is incremental: it contributes an evaluation rather than a new solution.

The paper addresses how large language models (LLMs) respond when their training data contains contradictory factual information, and proposes trusted source alignment (TSA) as a measure of whether a model sides with trusted publishers. It finds that scaling up PaLM-2 improves performance on the FactCheckQA dataset from near-random to up to 80% balanced accuracy.

Large language models (LLMs) are trained on web-scale corpora that inevitably include contradictory factual information from sources of varying reliability. In this paper, we propose measuring an LLM property called trusted source alignment (TSA): the model's propensity to align with content produced by trusted publishers in the face of uncertainty or controversy. We present FactCheckQA, a TSA evaluation dataset based on a corpus of fact checking articles. We describe a simple protocol for evaluating TSA and offer a detailed analysis of design considerations including response extraction, claim contextualization, and bias in prompt formulation. Applying the protocol to PaLM-2, we find that as we scale up the model size, the model performance on FactCheckQA improves from near-random to up to 80% balanced accuracy in aligning with trusted sources.
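The reported metric, balanced accuracy, is the mean of the recall on each class, so a model that always gives the same answer scores 0.5 regardless of class imbalance; that is what "near-random" corresponds to here. The following is a minimal sketch of the metric only, assuming binary true/false verdicts. The function name and the toy data are illustrative assumptions and do not reflect the actual FactCheckQA format or the authors' evaluation code.

```python
def balanced_accuracy(labels, predictions):
    """Mean of the recall on the true class and the recall on the false class."""
    true_pos = sum(1 for y, p in zip(labels, predictions) if y and p)
    true_neg = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one example of each class")
    return 0.5 * (true_pos / n_pos + true_neg / n_neg)


# Hypothetical toy data: True means the trusted source rated the claim as true.
labels = [True, True, False, False, False, False]

# A model that always answers "false" gets 4/6 plain accuracy
# but only 0.5 balanced accuracy.
always_false = [False] * 6
print(balanced_accuracy(labels, always_false))  # 0.5

# One of two true claims and three of four false claims matched.
mixed = [True, False, False, False, False, True]
print(balanced_accuracy(labels, mixed))  # 0.625
```

Averaging per-class recall keeps a skewed ratio of true to false claims from inflating the score, which is presumably why the metric suits a fact-checking corpus.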
