CV · May 29, 2025

Navigating the Accuracy-Size Trade-Off with Flexible Model Merging

arXiv:2505.23209v2 · 1 citation · h-index: 54
Originality: Highly original
AI Analysis

This work addresses the storage and accuracy challenges practitioners face when deploying multi-task models. It offers a flexible solution, but is an incremental advance in model merging techniques.

The paper tackles the accuracy-size trade-off in model merging by proposing FlexMerge, a framework that generates merged models of varying sizes and supports multiple merging algorithms. It finds that modest size increases can yield accuracy gains of up to 13.5%, and that algorithm rankings change with size.

Model merging has emerged as an efficient method to combine multiple single-task fine-tuned models. The merged model can enjoy multi-task capabilities without expensive training. While promising, merging into a single model often suffers from an accuracy gap with respect to the fine-tuned models. On the other hand, deploying all individual fine-tuned models incurs high storage costs. We propose FlexMerge, a novel data-free model merging framework that: (a) flexibly generates merged models of varying sizes, spanning the full spectrum from a single merged model to retaining all fine-tuned models; and (b) supports multiple merging algorithms in a unified framework. Using FlexMerge, we systematically characterize the accuracy-size trade-off of different algorithms. Our study reveals two key findings: first, even modestly larger merged models can yield steep accuracy gains (up to 13.5% when just doubling the size); second, algorithm rankings are not consistent as size increases, with some methods overtaking others beyond the one-model regime. These results uncover a new design dimension for model merging: developing and comparing algorithms across the full spectrum of sizes rather than only at the single-model limit. Extensive experiments on vision and NLP benchmarks, with up to 30 tasks, confirm the generality and practicality of FlexMerge.
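
The size spectrum described in the abstract can be made concrete with a minimal sketch. This is not the paper's algorithm, whose details are not given here. It is an illustration under stated assumptions: each model is a dictionary of named parameter blocks, a block is either shared by all tasks (its task vectors are averaged) or kept per task, and blocks are merged greedily in order of task-vector agreement until a size budget is met. The function name `flexible_merge`, the `budget` parameter, and the cosine-agreement heuristic are all hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two arrays, treated as flat vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a.ravel() @ b.ravel() / denom) if denom > 0 else 0.0

def flexible_merge(base, finetuned, budget):
    """Illustrative size-budgeted merge (an assumption, not FlexMerge itself).

    base:      dict mapping block name -> pretrained weights
    finetuned: list of dicts with the same keys, one per task
    budget:    target size in units of one model (1 <= budget <= number of tasks)
    Returns one model per task and the achieved size in the same units.
    """
    num_tasks = len(finetuned)
    total_params = sum(w.size for w in base.values())
    # Task vector = fine-tuned weights minus pretrained weights, per block.
    task_vecs = {k: [ft[k] - base[k] for ft in finetuned] for k in base}

    def agreement(k):
        # Mean pairwise cosine similarity of the task vectors for block k.
        vs = task_vecs[k]
        sims = [cosine(vs[i], vs[j])
                for i in range(num_tasks) for j in range(i + 1, num_tasks)]
        return float(np.mean(sims)) if sims else 1.0

    # Start from "keep every fine-tuned model" (size = num_tasks) and share
    # the most-agreeing blocks first until the budget is satisfied.
    shared = set()
    size = float(num_tasks)
    for k in sorted(base, key=agreement, reverse=True):
        if size <= budget + 1e-9:
            break
        shared.add(k)
        size -= (num_tasks - 1) * base[k].size / total_params

    models = []
    for t in range(num_tasks):
        models.append({
            k: base[k] + (np.mean(task_vecs[k], axis=0) if k in shared
                          else task_vecs[k][t])
            for k in base
        })
    return models, size

# Toy example: two tasks, two equally sized blocks.
base = {"a": np.zeros(2), "b": np.zeros(2)}
ft1 = {"a": np.array([1.0, 0.0]), "b": np.array([1.0, 1.0])}
ft2 = {"a": np.array([-1.0, 0.0]), "b": np.array([1.0, 1.0])}
models, size = flexible_merge(base, [ft1, ft2], budget=1.5)
```

In this toy setup, a budget of 1 shares every block (a single merged model), a budget equal to the number of tasks keeps every fine-tuned model intact, and intermediate budgets share only the blocks on which the tasks agree most.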

