CL · AI · Sep 20, 2025

PruneCD: Contrasting Pruned Self Model to Improve Decoding Factuality

arXiv:2509.16598v2 · 3 citations · h-index: 4 · EMNLP
Originality: Incremental advance
AI Analysis

PruneCD addresses hallucination in large language models for users who require reliable outputs, though it is an incremental advance over prior methods such as DoLa.

The authors address hallucination in large language models by proposing PruneCD, a contrastive decoding method that constructs the amateur model through layer pruning instead of early exit. They report improved factuality with minimal inference overhead.

To mitigate the hallucination problem in large language models, DoLa exploits early exit logits from the same model as a contrastive prior. However, we found that these early exit logits tend to be flat, low in magnitude, and fail to reflect meaningful contrasts. To address this, we propose PruneCD, a novel contrastive decoding method that constructs the amateur model via layer pruning rather than early exit. This design leads to more informative and well-aligned logits, enabling more effective contrastive decoding. Through qualitative and quantitative analyses, we demonstrate that PruneCD consistently improves factuality with minimal inference overhead, offering a robust and practical approach to mitigating hallucinations in LLMs.
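
The abstract does not spell out the scoring rule, so the following is only a minimal sketch of the contrast step, assuming the commonly used contrastive decoding formulation: the score of a token is the expert log-probability minus the amateur log-probability, restricted to tokens the expert itself finds plausible. The function names, the `alpha` cutoff parameter, and the toy logits are illustrative assumptions rather than details from the paper, and the sketch does not build the pruned model; it takes the amateur logits as given.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_scores(expert_logits, amateur_logits, alpha=0.1):
    """Score tokens by expert log-prob minus amateur log-prob.

    Assumption: `alpha` is a plausibility cutoff. Tokens whose expert
    probability is below alpha * (max expert probability) are masked
    out with -inf so the contrast cannot promote implausible tokens.
    """
    expert_lp = log_softmax(expert_logits)
    amateur_lp = log_softmax(amateur_logits)
    cutoff = math.log(alpha) + max(expert_lp)
    return [
        e - a if e >= cutoff else float("-inf")
        for e, a in zip(expert_lp, amateur_lp)
    ]


def pick_token(expert_logits, amateur_logits, alpha=0.1):
    """Greedy choice: index of the highest contrastive score."""
    scores = contrastive_scores(expert_logits, amateur_logits, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy vocabulary of three tokens (hypothetical numbers).
expert = [2.0, 1.8, -3.0]    # full model: tokens 0 and 1 are close
informative = [2.0, 0.5, 1.0]  # amateur that clearly prefers token 0
flat = [0.0, 0.0, 0.0]       # amateur with flat logits

choice_informative = pick_token(expert, informative)
choice_flat = pick_token(expert, flat)
```

With a flat amateur, the subtraction shifts every plausible token by the same constant, so the choice is the same as plain greedy decoding from the expert. That matches the abstract's concern that flat early exit logits fail to provide a meaningful contrast. With the informative amateur, the token the amateur over-prefers is penalized and the other plausible token is selected.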

