CL · May 29, 2025

From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs

arXiv:2505.23410v1
h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses factual inaccuracies in fine-tuned LLMs, a problem relevant to AI practitioners, and offers an incremental improvement by relying on existing prompting techniques rather than new training methods.

The paper investigates the factuality gap between LLMs fine-tuned on known knowledge and those fine-tuned on unknown knowledge. It finds that inference-stage techniques such as in-context learning can mitigate this gap, and supports the finding with a theoretical analysis from a knowledge-graph perspective.

Factual knowledge extraction aims to explicitly extract knowledge parameterized in pre-trained language models for application in downstream tasks. While prior work has investigated the impact of supervised fine-tuning data on the factuality of large language models (LLMs), the underlying mechanism remains poorly understood. We revisit this impact through systematic experiments, with a particular focus on the factuality gap that arises when fine-tuning on known versus unknown knowledge. Our findings show that this gap can be mitigated at the inference stage, either under out-of-distribution (OOD) settings or by using appropriate in-context learning (ICL) prompts (i.e., few-shot learning and Chain of Thought (CoT)). We prove this phenomenon theoretically from the perspective of knowledge graphs, showing that the test-time prompt may diminish or even overshadow the impact of fine-tuning data and play a dominant role in knowledge extraction. Ultimately, our results shed light on the interaction between fine-tuning data and the test-time prompt, demonstrating that ICL can effectively compensate for shortcomings in fine-tuning data, and highlighting the need to reconsider the use of ICL prompting as a means to evaluate the effectiveness of fine-tuning data selection methods.
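To make the two ICL prompt styles named in the abstract concrete, here is a minimal sketch of how a few-shot prompt and a zero-shot CoT prompt might be assembled for a factual question. The "Question:/Answer:" template, the function names, and the "Let's think step by step." trigger phrase are illustrative assumptions, not the templates used in the paper.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Prepend (question, answer) demonstrations to the test question.

    Hypothetical template: the paper's actual prompt format is not given here.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    # The test question goes last, with the answer left open for the model.
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


def build_cot_prompt(question: str) -> str:
    """Zero-shot Chain-of-Thought style prompt (assumed trigger phrase)."""
    return f"Question: {question}\nAnswer: Let's think step by step."


if __name__ == "__main__":
    demos = [
        ("What is the capital of France?", "Paris"),
        ("Who wrote Hamlet?", "William Shakespeare"),
    ]
    print(build_few_shot_prompt(demos, "What is the capital of Japan?"))
    print()
    print(build_cot_prompt("What is the capital of Japan?"))
```

Either string would be passed unchanged to the fine-tuned model at inference time; only the prompt differs between conditions, which is what lets the effect of the test-time prompt be compared against the effect of the fine-tuning data.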

