LG · AI · CL · Feb 17, 2025

Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging

arXiv:2502.12217v1 · 3 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for researchers and practitioners who customize LLMs through cost-effective model merging. It appears incremental, since it builds on existing merging methods.

The paper tackles parameter interference in model merging for Large Language Models (LLMs), which degrades the merged model's performance. It proposes Optimal Brain Iterative Merging (OBIM) to mitigate both intra-model and inter-model interference, and reports that OBIM significantly outperforms existing merging techniques.

Large Language Models (LLMs) have demonstrated impressive capabilities, but their high computational costs pose challenges for customization. Model merging offers a cost-effective alternative, yet existing methods suffer from interference among parameters, leading to performance degradation. In this work, we propose Optimal Brain Iterative Merging (OBIM), a novel method designed to mitigate both intra-model and inter-model interference. OBIM consists of two key components: (1) A saliency measurement mechanism that evaluates parameter importance based on loss changes induced by individual weight alterations, reducing intra-model interference by preserving only high-saliency parameters. (2) A mutually exclusive iterative merging framework, which incrementally integrates models using a binary mask to avoid direct parameter averaging, thereby mitigating inter-model interference. We validate OBIM through experiments on both Supervised Fine-Tuned (SFT) models and post-pretrained checkpoints. The results show that OBIM significantly outperforms existing merging techniques. Overall, OBIM provides an effective and practical solution for enhancing LLM merging.
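
The two components described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the saliency formula (a second-order, Optimal Brain Damage style proxy `0.5 * h * delta**2`), the `curvatures` inputs, the `keep_ratio` parameter, and all function names are assumptions made for illustration. The sketch only shows the general idea of keeping high-saliency parameter changes per model and assigning each parameter position to at most one model through a binary mask, rather than averaging.

```python
import numpy as np

def saliency(delta, curvature):
    # Assumed proxy for "loss change induced by altering one weight":
    # a diagonal second-order term, 0.5 * h_ii * delta_i^2.
    return 0.5 * curvature * delta ** 2

def top_k_mask(scores, keep_ratio, free):
    # Boolean mask selecting the highest-scoring positions among those
    # not yet claimed by an earlier model.
    k = min(int(round(keep_ratio * scores.size)), int(free.sum()))
    flat = np.zeros(scores.size, dtype=bool)
    if k > 0:
        candidates = np.where(free, scores, -np.inf).ravel()
        flat[np.argpartition(candidates, -k)[-k:]] = True
    return flat.reshape(scores.shape)

def iterative_merge(base, finetuned, curvatures, keep_ratio=0.2):
    # Integrate models one at a time. Each position receives the update
    # of at most one model, so updates are never averaged together.
    merged = base.copy()
    free = np.ones(base.shape, dtype=bool)
    for theta, h in zip(finetuned, curvatures):
        delta = theta - base
        mask = top_k_mask(saliency(delta, h), keep_ratio, free)
        merged[mask] += delta[mask]
        free &= ~mask
    return merged
```

In this sketch the result depends on the order in which models are merged, because an earlier model claims contested positions first. How the paper orders models and estimates saliency is not stated in the abstract.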

