CV | AI | Jun 16, 2025

DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models

arXiv:2506.13638v2 | 4 citations | h-index: 11 | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses model editing for VLMs, a domain-specific problem. The advance is incremental: it builds on existing editing methods and extends them to multi-modal contexts.

The paper tackles the problem of efficiently updating knowledge in vision-language models (VLMs) without full retraining. It proposes DualEdit, which edits both textual and visual modalities at their respective key layers and adds a gating module to preserve original capabilities, and it reports better results than state-of-the-art baselines across multiple backbones and datasets.

Model editing aims to efficiently update a pre-trained model's knowledge without the need for time-consuming full retraining. While existing pioneering editing methods achieve promising results, they primarily focus on editing single-modal language models (LLMs). However, for vision-language models (VLMs), which involve multiple modalities, the role and impact of each modality on editing performance remain largely unexplored. To address this gap, we explore the impact of textual and visual modalities on model editing and find that: (1) textual and visual representations reach peak sensitivity at different layers, reflecting their varying importance; and (2) editing both modalities can efficiently update knowledge, but this comes at the cost of compromising the model's original capabilities. Based on our findings, we propose DualEdit, an editor that modifies both textual and visual modalities at their respective key layers. Additionally, we introduce a gating module within the more sensitive textual modality, allowing DualEdit to efficiently update new knowledge while preserving the model's original information. We evaluate DualEdit across multiple VLM backbones and benchmark datasets, demonstrating its superiority over state-of-the-art VLM editing baselines as well as adapted LLM editing methods on different evaluation metrics. Code is available at https://github.com/zhiyiscs/DualEdit
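
The gating idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the cosine-similarity gate, the `threshold` value, the use of one shared gate for both modalities, and every name below (`gated_edit`, `dual_edit`, `edit_key`) are assumptions made for illustration only. The sketch shows one plausible reading, in which an edit is applied to the textual and visual hidden states only when the textual state is close to the stored edit key, so unrelated inputs pass through unchanged.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def gated_edit(hidden, edit_key, delta, threshold=0.8):
    """Add `delta` to `hidden` only if `hidden` is similar to `edit_key`.

    Hypothetical gate: a hard 0/1 switch on cosine similarity. The actual
    gating module in the paper may differ.
    """
    gate = 1.0 if cosine(hidden, edit_key) >= threshold else 0.0
    edited = [h + gate * d for h, d in zip(hidden, delta)]
    return edited, gate


def dual_edit(text_hidden, visual_hidden, edit_key,
              text_delta, visual_delta, threshold=0.8):
    """Edit both modalities, with the gate computed on the textual side.

    Assumption: the textual gate also switches the visual edit on or off.
    """
    text_out, gate = gated_edit(text_hidden, edit_key, text_delta, threshold)
    visual_out = [v + gate * d for v, d in zip(visual_hidden, visual_delta)]
    return text_out, visual_out, gate


# In-scope input: textual state matches the edit key, so both edits apply.
in_scope = dual_edit([1.0, 0.0], [2.0, 2.0], [1.0, 0.0], [0.5, 0.5], [1.0, -1.0])

# Out-of-scope input: textual state is orthogonal to the key, so nothing changes.
out_of_scope = dual_edit([0.0, 1.0], [2.0, 2.0], [1.0, 0.0], [0.5, 0.5], [1.0, -1.0])
```

In this toy setup the in-scope call shifts both hidden states by their deltas, while the out-of-scope call returns them untouched, which mirrors the stated goal of updating knowledge while preserving the model's original behavior on unrelated inputs.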

