AI · CV · Apr 13, 2025

Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs

arXiv:2504.09456v1 · 6 citations · h-index: 26 · Has Code
Originality: Incremental advance
AI Analysis

This addresses reliability concerns for LMMs in real-world applications, though it is incremental as it builds on existing models and benchmarks.

The paper tackles the problem of negation-based gaslighting in Large Multimodal Models, where deceptive inputs cause accuracy drops, and introduces GasEraser, a training-free method that reduces the misguidance rate by 48.2% for LLaVA-v1.5-7B.

Large Multimodal Models (LMMs) have demonstrated remarkable capabilities across a wide range of tasks. However, their vulnerability to user gaslighting, the deliberate use of misleading or contradictory inputs, raises critical concerns about their reliability in real-world applications. In this paper, we address the novel and challenging issue of mitigating the negative impact of negation-based gaslighting on LMMs, where deceptive user statements lead to significant drops in model accuracy. Specifically, we introduce GasEraser, a training-free approach that reallocates attention weights from misleading textual tokens to semantically salient visual regions. By suppressing the influence of "attention sink" tokens and enhancing focus on visually grounded cues, GasEraser significantly improves LMM robustness without requiring retraining or additional supervision. Extensive experimental results demonstrate that GasEraser is effective across several leading open-source LMMs on the GaslightingBench. Notably, for LLaVA-v1.5-7B, GasEraser reduces the misguidance rate by 48.2%, demonstrating its potential for more trustworthy LMMs.
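
To make the idea of attention reallocation concrete, here is a minimal NumPy sketch. It is not the GasEraser implementation, and the abstract does not specify the paper's actual rule. The function name `reallocate_attention`, the `suppress` fraction, and the proportional redistribution scheme are all assumptions chosen for illustration. The sketch takes one row of attention weights, removes a fraction of the mass from designated text tokens, and hands that mass to the visual tokens in proportion to the attention they already receive.

```python
import numpy as np


def reallocate_attention(attn, text_idx, visual_idx, suppress=0.5):
    """Shift attention mass from chosen text tokens to visual tokens.

    Illustrative only; every name and parameter here is hypothetical.

    attn       : 1-D array of non-negative weights summing to 1 (one query row).
    text_idx   : indices of text tokens to suppress (e.g. assumed "sink" tokens).
    visual_idx : indices of visual tokens that receive the freed mass.
    suppress   : fraction of mass removed from each suppressed token (assumed).
    """
    attn = np.asarray(attn, dtype=float)
    out = attn.copy()

    # Mass freed by down-weighting the suppressed text tokens.
    freed = suppress * attn[text_idx].sum()
    out[text_idx] *= 1.0 - suppress

    # Give the freed mass to visual tokens in proportion to their current
    # weight, so regions that are already salient gain the most.
    vis = attn[visual_idx]
    if vis.sum() > 0:
        share = vis / vis.sum()
    else:
        share = np.full(len(visual_idx), 1.0 / len(visual_idx))
    out[visual_idx] += freed * share

    return out


# Toy row: token 0 is a suppressed text token, tokens 2 and 3 are visual.
row = np.array([0.5, 0.1, 0.3, 0.1])
new_row = reallocate_attention(row, text_idx=[0], visual_idx=[2, 3])
print(new_row, new_row.sum())
```

Because the mass removed from the text tokens is exactly the mass added to the visual tokens, the row still sums to 1 and no renormalization is needed. In a real LMM such an adjustment would have to be applied inside the attention layers at inference time, which this sketch does not attempt.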

