CL | May 22, 2025

Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs

arXiv:2505.16703v1 | 3 citations | h-index: 15 | EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses catastrophic forgetting in multimodal LLMs, offering AI researchers and practitioners an incremental improvement over existing model merging techniques.

The paper tackles the catastrophic forgetting of language abilities that multimodal large language models suffer during instruction tuning. It proposes Locate-then-Merge, a training-free parameter fusion framework, together with Neuron-Fusion, a strategy that selectively merges neurons based on their parameter shifts to preserve visual adaptation while mitigating language degradation. Across 13 benchmarks covering language and visual tasks, Neuron-Fusion consistently outperforms existing model merging methods, and it also reduces context hallucination.

Although multimodal large language models (MLLMs) have achieved impressive performance, the multimodal instruction tuning stage often causes catastrophic forgetting of the base LLM's language ability, even in strong models like Llama3. To address this, we propose Locate-then-Merge, a training-free parameter fusion framework that first locates important parameters and then selectively merges them. We further introduce Neuron-Fusion, a neuron-level strategy that preserves the influence of neurons with large parameter shifts--neurons likely responsible for newly acquired visual capabilities--while attenuating the influence of neurons with smaller changes that likely encode general-purpose language skills. This design enables better retention of visual adaptation while mitigating language degradation. Experiments on 13 benchmarks across both language and visual tasks show that Neuron-Fusion consistently outperforms existing model merging methods. Further analysis reveals that our method effectively reduces context hallucination in generation.
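
The locate-then-merge idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `neuron_fusion_sketch`, the use of a per-row L2 norm as the "parameter shift" score, and the `keep_ratio` and `attenuation` parameters are all assumptions made for illustration. It treats each row of a weight matrix as one neuron, keeps the full fine-tuned update for the neurons that moved the most, and scales the update down for the rest.

```python
import numpy as np


def neuron_fusion_sketch(w_base, w_tuned, keep_ratio=0.1, attenuation=0.0):
    """Hypothetical neuron-level merge of a base and a fine-tuned weight matrix.

    Each row is treated as one neuron (an assumption, not the paper's definition).
    Neurons with the largest shift keep their full update; the remaining
    neurons have their update multiplied by `attenuation`.
    """
    if w_base.shape != w_tuned.shape:
        raise ValueError("base and tuned weights must have the same shape")

    # Locate: score each neuron by the L2 norm of its parameter shift.
    delta = w_tuned - w_base
    shift = np.linalg.norm(delta, axis=1)

    # Select the top `keep_ratio` fraction of neurons (at least one).
    n_keep = max(1, int(round(keep_ratio * shift.size)))
    keep_idx = np.argsort(shift)[-n_keep:]

    # Merge: full update for selected neurons, attenuated update elsewhere.
    scale = np.full(shift.size, attenuation, dtype=float)
    scale[keep_idx] = 1.0
    return w_base + scale[:, None] * delta


# Toy example: only the third neuron shifts strongly during "tuning".
w_base = np.zeros((4, 3))
w_tuned = np.array([
    [0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0],
    [5.0, 5.0, 5.0],
    [0.0, 0.0, 0.1],
])
merged = neuron_fusion_sketch(w_base, w_tuned, keep_ratio=0.25, attenuation=0.0)
```

With `attenuation=0.0` the low-shift neurons revert fully to the base weights, while a value between 0 and 1 interpolates between the base and tuned models. How the paper actually scores and rescales neurons may differ from this sketch.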
