LG · AI · Jun 18, 2025

Model Fusion via Neuron Interpolation

ETH Zurich
arXiv:2507.00037v1 · 2 citations · h-index: 11 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of combining multiple neural networks into a single effective model. The advance is incremental, but it offers improvements in scenarios where the parent models were trained on differently distributed data.

The paper tackles model fusion with a neuron-centric algorithm that groups intermediate neurons of the parent models and uses neuron attribution scores to build the fused model. On benchmark datasets it consistently outperforms previous fusion techniques, especially in zero-shot and non-IID scenarios.

Model fusion aims to combine the knowledge of multiple models by creating one representative model that captures the strengths of all of its parents. However, this process is non-trivial due to differences in internal representations, which can stem from permutation invariance, random initialization, or differently distributed training data. We present a novel, neuron-centric family of model fusion algorithms designed to integrate multiple trained neural networks into a single network effectively regardless of training data distribution. Our algorithms group intermediate neurons of parent models to create target representations that the fused model approximates with its corresponding sub-network. Unlike prior approaches, our approach incorporates neuron attribution scores into the fusion process. Furthermore, our algorithms can generalize to arbitrary layer types. Experimental results on various benchmark datasets demonstrate that our algorithms consistently outperform previous fusion techniques, particularly in zero-shot and non-IID fusion scenarios. The code is available at https://github.com/AndrewSpano/neuron-interpolation-model-fusion.
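The grouping step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it assumes, purely for illustration, that neurons are grouped with an attribution-weighted k-means over their activations on a shared batch, and that each group's weighted centroid serves as the target representation. The names `group_neurons`, `weighted_mean`, and `sq_dist` are hypothetical, and the abstract does not specify the clustering method or the attribution measure.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two activation vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_mean(vectors: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> Vector:
    """Attribution-weighted average of a group of activation vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[d] for v, w in zip(vectors, weights)) / total
            for d in range(dim)]


def group_neurons(activations: Sequence[Sequence[float]],
                  attributions: Sequence[float],
                  num_groups: int,
                  num_iters: int = 20) -> Tuple[List[int], List[Vector]]:
    """Illustrative attribution-weighted k-means (an assumption, not the
    paper's algorithm).

    activations:  one vector per parent neuron, holding that neuron's
                  outputs on a shared batch of inputs.
    attributions: one non-negative importance score per neuron.
    Returns the group index of every neuron and one target vector per group.
    """
    # Deterministic initialisation: start from the highest-attribution neurons.
    order = sorted(range(len(activations)), key=lambda i: -attributions[i])
    centers = [list(activations[i]) for i in order[:num_groups]]
    assignments = [0] * len(activations)

    for _ in range(num_iters):
        # Assign every neuron to its nearest target representation.
        assignments = [
            min(range(num_groups), key=lambda g: sq_dist(a, centers[g]))
            for a in activations
        ]
        # Move each target to the attribution-weighted mean of its members.
        for g in range(num_groups):
            members = [i for i, a in enumerate(assignments) if a == g]
            weights = [attributions[i] for i in members]
            if members and sum(weights) > 0:
                centers[g] = weighted_mean(
                    [activations[i] for i in members], weights)
    return assignments, centers


# Toy data: two parent models with two neurons each, evaluated on three
# inputs. The second parent's neurons are a permuted, slightly perturbed
# copy of the first parent's.
activations = [
    [1.0, 0.0, 1.0],  # parent A, neuron 0
    [0.0, 1.0, 0.0],  # parent A, neuron 1
    [0.0, 1.2, 0.0],  # parent B, neuron 0 (matches A's neuron 1)
    [1.0, 0.0, 0.8],  # parent B, neuron 1 (matches A's neuron 0)
]
attributions = [1.0, 1.0, 1.0, 3.0]  # made-up importance scores

assignments, targets = group_neurons(activations, attributions, num_groups=2)
```

In this toy setup the matching neurons of the two parents end up in the same group despite the permutation, and the neuron with the larger attribution score pulls its group's target toward its own activations. In a full pipeline the fused model's layer would then be fitted to reproduce these targets, a step this sketch leaves out.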

