LG | AI | Feb 15, 2025

Superpose Task-specific Features for Model Merging

arXiv:2502.10698v2 | 4 citations | h-index: 3 | Has Code | EMNLP
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enabling multi-task capabilities in neural networks without additional training, representing an incremental improvement over prior model merging methods.

The paper tackles model merging in neural networks by superposing task-specific features from individual models into a single merged model. The authors report that the method outperforms existing techniques across diverse benchmarks.

Model merging enables powerful capabilities in neural networks without requiring additional training. In this paper, we introduce a novel perspective on model merging by leveraging the fundamental mechanisms of neural network representation. Our approach is motivated by the linear representation hypothesis, which states that neural networks encode information through linear combinations of feature vectors. We propose a method that superposes task-specific features from individual models into a merged model. Our approach specifically targets linear transformation matrices, which are crucial for feature activation and extraction in deep networks. By formulating the merging process as a linear system, we can preserve task-specific features from individual models and create merged models that maintain multi-task capabilities more effectively than existing methods. Extensive experiments across diverse benchmarks and models demonstrate that our method outperforms existing techniques. Code is available at https://github.com/LARS-research/STF.
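
To make the "merging as a linear system" idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that) and rests on assumptions the abstract does not confirm: that the "features" of a task are the top right singular vectors of its task matrix (fine-tuned weight minus pretrained weight), and that the merged matrix is chosen so each task's features map to the same outputs as under that task's own matrix. The function name `superpose_task_features` and the `rank` parameter are hypothetical.

```python
import numpy as np


def superpose_task_features(task_matrices, rank):
    """Hypothetical helper: merge task matrices by matching their top features.

    Assumption (not taken from the paper's code): the features of task k are
    the top-`rank` right singular vectors V_k of its task matrix T_k, and the
    merged matrix M should satisfy M @ V_k ~= T_k @ V_k for every task.
    """
    inputs, targets = [], []
    for T in task_matrices:
        # Rows of Vt are right singular vectors, ordered by singular value.
        _, _, Vt = np.linalg.svd(T, full_matrices=False)
        V_k = Vt[:rank].T            # (d_in, rank) input feature directions
        inputs.append(V_k)
        targets.append(T @ V_k)      # (d_out, rank) responses to preserve
    V = np.concatenate(inputs, axis=1)   # (d_in, K * rank)
    Y = np.concatenate(targets, axis=1)  # (d_out, K * rank)
    # Solve M @ V = Y in the least-squares sense via V.T @ M.T = Y.T.
    M_T, *_ = np.linalg.lstsq(V.T, Y.T, rcond=None)
    return M_T.T


rng = np.random.default_rng(0)
d_out, d_in, rank = 6, 8, 2
# Two synthetic rank-2 task matrices standing in for real fine-tuning deltas.
tasks = [
    rng.normal(size=(d_out, rank)) @ rng.normal(size=(rank, d_in))
    for _ in range(2)
]
merged = superpose_task_features(tasks, rank)

for k, T in enumerate(tasks):
    _, _, Vt = np.linalg.svd(T, full_matrices=False)
    V_k = Vt[:rank].T
    err = np.linalg.norm(merged @ V_k - T @ V_k)
    print(f"task {k}: feature reconstruction error = {err:.2e}")
```

In this toy setting the number of stacked feature vectors (4) is below the input dimension (8), so the system can be satisfied essentially exactly. When the stacked features outnumber the input dimension, the same least-squares call returns only the best approximate fit, and how a real method handles that trade-off is beyond what the abstract states.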

