CL · Sep 18, 2025

V-SEAM: Visual Semantic Editing and Attention Modulating for Causal Interpretability of Vision-Language Models

arXiv:2509.14837v1 · 4 citations · h-index: 6 · Has Code · EMNLP
Originality: Incremental advance
AI Analysis

This work addresses the need for more interpretable vision-language models, a concern for both researchers and practitioners. It appears incremental, since it builds on existing causal interpretability methods.

The paper tackles the limited semantic insight offered by causal interpretability methods for vision-language models. It introduces V-SEAM, a framework that enables concept-level visual manipulations and identifies attention heads that contribute positively or negatively to predictions. Modulating those heads improves performance for LLaVA and InstructBLIP across three VQA benchmarks.

Recent advances in causal interpretability have extended from language models to vision-language models (VLMs), seeking to reveal their internal mechanisms through input interventions. While textual interventions often target semantics, visual interventions typically rely on coarse pixel-level perturbations, limiting semantic insights on multimodal integration. In this study, we introduce V-SEAM, a novel framework that combines Visual Semantic Editing and Attention Modulating for causal interpretation of VLMs. V-SEAM enables concept-level visual manipulations and identifies attention heads with positive or negative contributions to predictions across three semantic levels: objects, attributes, and relationships. We observe that positive heads are often shared within the same semantic level but vary across levels, while negative heads tend to generalize broadly. Finally, we introduce an automatic method to modulate key head embeddings, demonstrating enhanced performance for both LLaVA and InstructBLIP across three diverse VQA benchmarks. Our data and code are released at: https://github.com/petergit1/V-SEAM.
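To make the head-identification and modulation idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the linked repository for that). It assumes each attention head already has a scalar causal-effect score, for example the change in the correct answer's probability when the image is semantically edited. The function names, the top-k selection rule, and the `boost`/`damp` factors are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Head = Tuple[int, int]  # (layer index, head index)


def rank_heads(effects: Dict[Head, float], k: int) -> Tuple[List[Head], List[Head]]:
    """Pick up to k heads with the largest positive effect and up to k
    with the most negative effect (hypothetical selection rule)."""
    ordered = sorted(effects, key=lambda h: effects[h], reverse=True)
    positive = [h for h in ordered[:k] if effects[h] > 0]
    negative = [h for h in ordered[::-1][:k] if effects[h] < 0]
    return positive, negative


def head_scales(positive: List[Head], negative: List[Head],
                boost: float = 1.5, damp: float = 0.5) -> Dict[Head, float]:
    """Assign a multiplier to each selected head; boost and damp are
    assumed values, not taken from the paper."""
    scales = {h: boost for h in positive}
    scales.update({h: damp for h in negative})
    return scales


def modulate_layer(head_outputs: List[List[float]], layer: int,
                   scales: Dict[Head, float]) -> List[float]:
    """Scale each head's contribution vector and sum them. Unselected
    heads keep a multiplier of 1.0."""
    mixed = [0.0] * len(head_outputs[0])
    for idx, vec in enumerate(head_outputs):
        s = scales.get((layer, idx), 1.0)
        for j, v in enumerate(vec):
            mixed[j] += s * v
    return mixed


# Toy effect scores for a 2-layer, 2-head model (made-up numbers).
effects = {(0, 0): 0.30, (0, 1): -0.20, (1, 0): 0.05, (1, 1): -0.01}
positive, negative = rank_heads(effects, k=1)
scales = head_scales(positive, negative)
layer0 = modulate_layer([[1.0, 2.0], [4.0, 0.0]], layer=0, scales=scales)
```

In a real VLM the same scaling would typically be applied to per-head attention outputs inside the forward pass, for instance through a hook on the attention module. How V-SEAM chooses heads and multipliers automatically is described in the paper and is not reproduced here.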
