CL · May 27

Risk-aware Selective Prompting for Hallucination Mitigation in Large Vision-Language Models

arXiv:2605.28123 · 79.5
Predicted impact: top 71% in CL · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners using LVLMs, this work reveals the nuanced risk of verification prompting and offers a training-free method to selectively apply it, improving reliability.

The paper systematically studies verification prompting in LVLMs, finding it is risk-bearing: corrections increase with input difficulty but errors persist, causing harm on easy inputs. The proposed Risk-aware Selective Prompting (RSP) uses uncertainty to trigger verification selectively, mitigating degradation while preserving baseline performance.

Prompt-based verification is widely used to mitigate hallucinations in large vision-language models (LVLMs), yet when it helps remains poorly understood. We systematically study verification prompting across two representative LVLM architectures and hallucination benchmarks, and find that it is a risk-bearing intervention: its corrections increase with input difficulty, while newly introduced errors persist across difficulty levels. As a result, always-on prompting helps on hard inputs but offers little benefit -- and can harm -- easier ones. Our analysis further shows that this behavior is associated with a conservative output shift. Verification prompts redistribute attention from visual tokens toward instruction tokens and induce a distinct middle-layer entropy pattern absent in a neutral-prompt control, suggesting instruction-conditioned attention redistribution rather than uniformly improved visual grounding. Motivated by this input-dependent risk, we propose Risk-aware Selective Prompting (RSP), a training-free approach that uses pre-generation uncertainty signals to trigger verification selectively. RSP mitigates the degradation of always-on prompting while preserving baseline performance, and reveals that effective selection signals vary across architectures.
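
The selective triggering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which uncertainty signal RSP uses (and notes that effective signals vary across architectures), so the entropy of the first-token distribution stands in here as an assumed signal. The names `selective_prompt` and `VERIFY_SUFFIX`, the suffix wording, and the threshold value are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical verification instruction; the paper's actual prompt is not given here.
VERIFY_SUFFIX = "Check the image carefully before answering."


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def selective_prompt(
    question: str,
    first_token_logits: Sequence[float],
    threshold: float,
) -> Tuple[str, bool]:
    """Append the verification suffix only when the model looks uncertain.

    first_token_logits: logits from a forward pass on the plain prompt,
    taken before any text is generated (assumed uncertainty signal).
    Returns the prompt to use and whether verification was triggered.
    """
    uncertainty = entropy(softmax(first_token_logits))
    if uncertainty > threshold:
        return f"{question} {VERIFY_SUFFIX}", True
    # Confident input: keep the baseline prompt to avoid introducing new errors.
    return question, False


# Toy logits over a yes/no answer vocabulary.
easy_prompt, easy_flag = selective_prompt("Is there a dog?", [10.0, 0.0], 0.3)
hard_prompt, hard_flag = selective_prompt("Is there a dog?", [0.1, 0.0], 0.3)
```

In this toy run the confident input keeps its original prompt, while the near-uniform one receives the verification suffix. In practice the threshold would need tuning on held-out data for each model.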
