AI · Nov 13, 2025

Causal-HalBench: Uncovering LVLMs Object Hallucinations Through Causal Intervention

Zhe Xu, Zhicai Wang, Junkang Wu, Jinda Lu, Xiang Wang

arXiv:2511.10268v1 · 5.82 citations · h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses unreliable object recognition in LVLMs, which affects applications such as image captioning and visual question answering. The advance is incremental: it builds on existing hallucination-detection work by adding a causal formalization.

The paper tackles the problem of object hallucination in Large Vision-Language Models (LVLMs) by introducing causal analysis and a benchmark called Causal-HalBench, which uses counterfactual samples to quantify spurious correlations, showing that mainstream LVLMs are susceptible to these biases to varying degrees.

Large Vision-Language Models (LVLMs) often suffer from object hallucination, making erroneous judgments about the presence of objects in images. We propose this primarily stems from spurious correlations arising when models strongly associate highly co-occurring objects during training, leading to hallucinated objects influenced by visual context. Current benchmarks mainly focus on hallucination detection but lack a formal characterization and quantitative evaluation of spurious correlations in LVLMs. To address this, we introduce causal analysis into the object recognition scenario of LVLMs, establishing a Structural Causal Model (SCM). Utilizing the language of causality, we formally define spurious correlations arising from co-occurrence bias. To quantify the influence induced by these spurious correlations, we develop Causal-HalBench, a benchmark specifically constructed with counterfactual samples and integrated with comprehensive causal metrics designed to assess model robustness against spurious correlations. Concurrently, we propose an extensible pipeline for the construction of these counterfactual samples, leveraging the capabilities of proprietary LVLMs and Text-to-Image (T2I) models for their generation. Our evaluations on mainstream LVLMs using Causal-HalBench demonstrate these models exhibit susceptibility to spurious correlations, albeit to varying extents.
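
The abstract does not spell out the benchmark's causal metrics, so the following is only an illustrative sketch of how paired original/counterfactual answers could be summarized. It assumes each counterfactual image removes or swaps the target object while keeping the co-occurring context, and that the model is asked a yes/no presence question on both images. The names `PairedResult` and `counterfactual_summary` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PairedResult:
    """One original/counterfactual pair (hypothetical record layout)."""
    # Model's answer to "Is there a <object>?" on the original image,
    # where the object is assumed to be present.
    says_present_original: bool
    # Same question on the counterfactual image, where the object is assumed
    # to be absent but its usual context is kept.
    says_present_counterfactual: bool


def counterfactual_summary(results):
    """Summarize how often context alone triggers a 'present' answer.

    This is an assumed, simplified metric, not the paper's definition.
    """
    n = len(results)
    if n == 0:
        raise ValueError("need at least one paired result")

    # Correct answer on the original image is "present".
    original_accuracy = sum(r.says_present_original for r in results) / n
    # A "present" answer on the counterfactual image is a hallucination.
    hallucination_rate = sum(r.says_present_counterfactual for r in results) / n
    counterfactual_accuracy = 1.0 - hallucination_rate

    return {
        "original_accuracy": original_accuracy,
        "counterfactual_hallucination_rate": hallucination_rate,
        "accuracy_drop": original_accuracy - counterfactual_accuracy,
    }


if __name__ == "__main__":
    demo = [
        PairedResult(True, True),
        PairedResult(True, True),
        PairedResult(True, False),
        PairedResult(False, False),
    ]
    print(counterfactual_summary(demo))
```

Under these assumptions, a large gap between accuracy on original images and accuracy on counterfactual images would suggest the model leans on co-occurring context rather than on the object itself.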
