CL · Aug 18, 2024

Activated Parameter Locating via Causal Intervention for Model Merging

arXiv:2408.09485v1 · 4 citations · h-index: 8
Originality: Incremental advance
AI Analysis

This paper addresses parameter redundancy in model merging, a practical concern for AI practitioners, though it appears incremental because it builds on existing parameter-dropping approaches.

The paper tackles the problem of parameter conflicts in model merging by proposing an Activated Parameter Locating (APL) method that uses causal intervention to estimate parameter importance for more precise parameter dropping, achieving effective performance in both in-domain and out-of-domain settings.

Model merging combines multiple homologous models into one model, achieving convincing generalization without the necessity of additional training. A key challenge in this problem is resolving parameter redundancies and conflicts across multiple models. Existing methods have demonstrated that dropping a portion of delta parameters can alleviate conflicts while maintaining performance. However, these methods often drop parameters either randomly or based on magnitude, overlooking task-specific information embedded in fine-tuned models. In this paper, we propose an Activated Parameter Locating (APL) method that utilizes causal intervention to estimate parameter importance, enabling more precise parameter drops and better conflict mitigation. Moreover, to reduce the computational complexity associated with a large number of parameter partitions, we also introduce a theoretically supported gradient approximation strategy for APL. Experiments on model merging within both in-domain and out-of-domain settings, along with associated analyses, showcase the effectiveness of APL.
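The dropping strategies contrasted in the abstract can be illustrated with a small sketch. This is not the paper's implementation: every function name, the toy weights, and the toy gradients are hypothetical, and the score |gradient × delta| is only an assumed first-order stand-in for "the effect of intervening on a parameter", since the abstract does not give APL's exact formula. Merging by adding the scaled sum of sparsified deltas to the base weights is likewise an assumption made for illustration.

```python
import random


def delta(finetuned, base):
    """Delta parameters: fine-tuned weights minus the shared base weights."""
    return [f - b for f, b in zip(finetuned, base)]


def drop_random(d, drop_rate, seed=0):
    """Zero each delta with probability drop_rate and rescale the survivors."""
    rng = random.Random(seed)
    keep = 1.0 - drop_rate
    return [x / keep if rng.random() >= drop_rate else 0.0 for x in d]


def drop_by_score(d, scores, drop_rate):
    """Zero the drop_rate fraction of deltas that have the lowest scores."""
    n_drop = int(len(d) * drop_rate)
    order = sorted(range(len(d)), key=lambda i: scores[i])
    dropped = set(order[:n_drop])
    return [0.0 if i in dropped else x for i, x in enumerate(d)]


def drop_by_magnitude(d, drop_rate):
    """Magnitude-based dropping: the score is simply |delta|."""
    return drop_by_score(d, [abs(x) for x in d], drop_rate)


def first_order_importance(d, grads):
    """Assumed proxy (not from the paper): |grad * delta| approximates the
    loss change caused by resetting a delta to zero."""
    return [abs(g * x) for g, x in zip(grads, d)]


def merge(base, deltas, scale=1.0):
    """Add the scaled element-wise sum of the sparsified deltas to the base."""
    summed = [sum(col) for col in zip(*deltas)]
    return [b + scale * s for b, s in zip(base, summed)]


# Toy example with hypothetical numbers.
base = [0.0, 0.0, 0.0, 0.0]
model_a = [0.5, -0.1, 0.02, 0.3]
grads_a = [0.1, 2.0, 0.5, 0.0]  # hypothetical task-loss gradients

d_a = delta(model_a, base)
by_magnitude = drop_by_magnitude(d_a, 0.5)
by_importance = drop_by_score(d_a, first_order_importance(d_a, grads_a), 0.5)

print("magnitude-based: ", by_magnitude)
print("importance-based:", by_importance)
print("merged:          ", merge(base, [by_importance]))
```

In this toy case the two criteria keep different parameters: the second delta is small in magnitude but has a large gradient, so a task-aware score retains it while magnitude-based dropping discards it. That is the kind of task-specific signal the abstract says random and magnitude-based dropping overlook.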

