LG · May 30

Score $\times$ Decoder: A Unified View of Unsupervised Inference-Time Scaling for Hallucination Mitigation

Yun-Chen Cheng, Che-Yu Lin, Cheng-Lin Yang

arXiv:2606.00739 · 54.9

Predicted impact: top 48% in LG · last 90 days

Originality: Incremental advance

AI Analysis

For practitioners needing to reduce hallucinations without supervision, this work provides a systematic comparison of unsupervised methods, though findings are incremental and limited to a single model and dataset.

The paper investigates unsupervised inference-time scaling for hallucination mitigation in LLMs, evaluating a grid of four intrinsic scores and three decoding families on MATH500 with the base and instruction-tuned Qwen3-1.7B. The authors find that self-verification with a virtual-thinking prefix works well in most settings, but no single score is universally best; performance depends on the decoder and on model capability.

Large language models hallucinate even when the answer lies within their parameters. While inference-time scaling can surface this latent knowledge, the most effective methods require supervision: a trained verifier or reward model. We ask what can be done with only a base language model: which intrinsic signal best identifies correct outputs, and how should it be decoded? We cast this as a score $\times$ decoder grid pairing four scores (perplexity, contrastive, power-distribution likelihood, and self-verification) with three decoding families (optimization, sampling, consensus), and evaluate every cell on MATH500 with the base and instruction-tuned Qwen3-1.7B. While self-verification, which prompts the model to judge its own answer and is sharpened by a training-free virtual-thinking prefix, works well in most settings, no score has a fixed quality: its value depends on the decoder that consumes it and on model capability. When no supervision is available, the score and the decoding family must be chosen together.
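To make the score $\times$ decoder idea concrete, here is a minimal toy sketch, not the paper's implementation. It assumes each candidate is a final answer plus per-token log-probabilities, uses perplexity as the only intrinsic score, and pairs it with two simple selection rules: picking the single lowest-perplexity candidate, and a score-weighted vote over answers as a stand-in for a consensus-style decoder. The candidate data, the function names, and the mapping of these rules onto the paper's decoding families are all illustrative assumptions.

```python
import math
from collections import defaultdict

# Hypothetical candidates: (final answer, per-token log-probabilities).
# These numbers are invented for illustration only.
candidates = [
    ("42", [-0.1, -0.2, -0.1]),
    ("41", [-0.05, -0.05, -0.05]),
    ("42", [-0.3, -0.4, -0.2]),
]

def perplexity(logprobs):
    """Perplexity = exp of the negative mean token log-probability."""
    return math.exp(-sum(logprobs) / len(logprobs))

def neg_perplexity(logprobs):
    """Score where higher is better (lower perplexity)."""
    return -perplexity(logprobs)

def inverse_perplexity(logprobs):
    """Positive weight that grows as perplexity shrinks."""
    return 1.0 / perplexity(logprobs)

def best_of_n(cands, score):
    """Return the answer of the single highest-scoring candidate."""
    return max(cands, key=lambda c: score(c[1]))[0]

def weighted_vote(cands, weight):
    """Sum a per-candidate weight for each distinct answer; return the top answer."""
    totals = defaultdict(float)
    for answer, logprobs in cands:
        totals[answer] += weight(logprobs)
    return max(totals, key=totals.get)

print(best_of_n(candidates, neg_perplexity))          # single best candidate
print(weighted_vote(candidates, inverse_perplexity))  # score-weighted consensus
```

On this toy data the two rules disagree: the lowest-perplexity candidate answers "41", while the weighted vote favors "42" because two candidates support it. This is only a small illustration of why a score's usefulness can depend on the decoder that consumes it.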
