LGAICVMLJul 7, 2024

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

MILA
arXiv:2407.05385v113 citationsh-index: 12Has Code
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient model fusion for machine learning practitioners, offering an incremental improvement over existing alignment techniques.

The paper tackles the problem of merging multiple neural networks into a single model to reduce computational and storage costs, proposing CCA Merge based on Canonical Correlation Analysis, which outperforms previous methods in accuracy when merging models trained on the same or different data splits, including settings with more than two models.

Combining the predictions of multiple trained models through ensembling is generally a good way to improve accuracy by leveraging the different learned features of the models, however it comes with high computational and storage costs. Model fusion, the act of merging multiple models into one by combining their parameters reduces these costs but doesn't work as well in practice. Indeed, neural network loss landscapes are high-dimensional and non-convex and the minima found through learning are typically separated by high loss barriers. Numerous recent works have been focused on finding permutations matching one network features to the features of a second one, lowering the loss barrier on the linear path between them in parameter space. However, permutations are restrictive since they assume a one-to-one mapping between the different models' neurons exists. We propose a new model merging algorithm, CCA Merge, which is based on Canonical Correlation Analysis and aims to maximize the correlations between linear combinations of the model features. We show that our alignment method leads to better performances than past methods when averaging models trained on the same, or differing data splits. We also extend this analysis into the harder setting where more than 2 models are merged, and we find that CCA Merge works significantly better than past methods. Our code is publicly available at https://github.com/shoroi/align-n-merge

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes