AI | Apr 5

Don't Blink: Evidence Collapse during Multimodal Reasoning

arXiv:2604.04207 | 43.7
AI Analysis

This paper addresses a critical safety issue in deploying multimodal AI systems, particularly on tasks that require sustained visual reference: it reveals and mitigates a failure mode that text-only monitoring cannot detect.

The study finds that reasoning vision-language models (VLMs) can become more accurate while losing visual grounding as they reason, which yields confident but ungrounded predictions; attention to annotated evidence regions often drops by more than half. Guided by an entropy-vision interaction model, the authors propose a targeted vision veto that reduces selective risk by up to 1.9 percentage points at 90% coverage.

Reasoning VLMs can become more accurate while progressively losing visual grounding as they think. This creates task-conditional danger zones where low-entropy predictions are confident but ungrounded, a failure mode text-only monitoring cannot detect. Evaluating three reasoning VLMs on MathVista, HallusionBench, and MMMU_Pro, we find a pervasive evidence-collapse phenomenon: attention to annotated evidence regions drops substantially, often losing over half of evidence mass, as reasoning unfolds. Full-response entropy is the most reliable text-only uncertainty signal under cross-dataset transfer, yet adding vision features with a single global linear rule is brittle and often degrades transfer. An entropy-vision interaction model reveals a task-conditional regime: low-entropy, visually disengaged predictions are hazardous on sustained visual-reference tasks but benign on symbolic tasks. Using this structure, a targeted vision veto reduces selective risk by up to 1.9 percentage points at 90% coverage, while avoiding degradations where disengagement is expected. The results support task-aware multimodal monitoring for safe deployment under distribution shift.
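
To make "selective risk at 90% coverage" and the task-conditional veto concrete, here is a minimal sketch on toy data. It is not the authors' implementation: the function names, the thresholds, the penalty value, and the `needs_vision` flag are all hypothetical, and the numbers are invented for illustration. Selective risk is taken in its usual sense, the error rate on the fraction of predictions retained after abstaining on the most uncertain ones.

```python
import numpy as np

def selective_risk(errors, uncertainty, coverage=0.9):
    """Error rate on the `coverage` fraction of samples with the lowest uncertainty."""
    errors = np.asarray(errors, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    n_keep = max(1, int(round(coverage * len(errors))))
    keep = np.argsort(uncertainty, kind="stable")[:n_keep]
    return float(errors[keep].mean())

def vision_veto(entropy, evidence_mass, needs_vision,
                entropy_thresh=0.3, mass_thresh=0.2, penalty=1e6):
    """Hypothetical veto: push low-entropy, visually disengaged predictions to the
    back of the ranking, but only on tasks flagged as needing visual reference."""
    entropy = np.asarray(entropy, dtype=float)
    evidence_mass = np.asarray(evidence_mass, dtype=float)
    needs_vision = np.asarray(needs_vision, dtype=bool)
    danger = needs_vision & (entropy < entropy_thresh) & (evidence_mass < mass_thresh)
    return np.where(danger, entropy + penalty, entropy)

# Toy data (assumed, not from the paper): sample 0 is confident, wrong,
# and has almost no attention mass on the evidence region.
entropy = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
evidence_mass = np.array([0.05, 0.6, 0.5, 0.5, 0.4, 0.5, 0.6, 0.4, 0.5, 0.5])
errors = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])

visual_task = np.ones(10, dtype=bool)     # sustained visual-reference task
symbolic_task = np.zeros(10, dtype=bool)  # disengagement is expected here

baseline = selective_risk(errors, entropy)
vetoed = selective_risk(errors, vision_veto(entropy, evidence_mass, visual_task))
unchanged = selective_risk(errors, vision_veto(entropy, evidence_mass, symbolic_task))
print(baseline, vetoed, unchanged)
```

In this toy setup the entropy-only ranking retains the confident but ungrounded error, while the veto abstains on it when the task is flagged as visual and leaves the ranking untouched when it is not, mirroring the task-conditional behaviour the abstract describes.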
