AI · LG · Feb 6, 2025

Fine, I'll Merge It Myself: A Multi-Fidelity Framework for Automated Model Merging

arXiv:2502.04030v2 · 4 citations · h-index: 6
AI Analysis

This work addresses the need for efficient model merging to enhance reasoning capabilities in LLMs, offering an incremental improvement over existing manual approaches.

The paper replaces manually designed merging strategies for large language models with an automated multi-fidelity framework that searches the space of merging strategies efficiently. The search improves single-objective performance and finds multi-objective trade-offs with limited compute, e.g. in fewer than 500 search steps.

Reasoning capabilities represent a critical frontier for large language models (LLMs), but developing them requires extensive proprietary datasets and computational resources. Model merging offers a promising alternative: it efficiently supplements capabilities by combining multiple models without retraining. However, current merging approaches rely on manually designed strategies for merging hyperparameters, limiting the exploration of potential model combinations and requiring significant human effort. We propose an Automated Model Merging Framework that enables fine-grained exploration of merging strategies while reducing costs through multi-fidelity approximations. We support both single- and multi-objective optimization and introduce two novel search spaces: layer-wise fusion (LFS) and depth-wise integration (DIS). Evaluating across a number of benchmarks, we find that the search autonomously finds (1) merges that further boost single-objective performance, even on tasks the model has already been finetuned on, and (2) merges that optimize multi-objective frontiers across tasks. Effective merges are found with limited compute, e.g. in fewer than 500 search steps.
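
To make the idea concrete, here is a minimal sketch of a layer-wise merge searched with a multi-fidelity loop. It is not the paper's implementation. Everything in it is an assumption for illustration: models are toy dicts of weight lists, the merge is plain per-layer linear interpolation, the multi-fidelity scheme is simple successive halving, and the names merge_layerwise, successive_halving, and toy_score are hypothetical.

```python
import random


def merge_layerwise(model_a, model_b, alphas):
    """Linearly interpolate two models layer by layer.

    model_a, model_b: dict mapping layer name -> list of float weights.
    alphas: dict mapping layer name -> mixing coefficient in [0, 1]
            (1.0 keeps model_a's layer, 0.0 keeps model_b's layer).
    """
    return {
        name: [a * alphas[name] + b * (1.0 - alphas[name])
               for a, b in zip(model_a[name], model_b[name])]
        for name in model_a
    }


def successive_halving(candidates, evaluate, budgets, keep=0.5):
    """Score candidates at the cheapest budget first, keep the best
    fraction, then re-score the survivors at each larger budget."""
    survivors = list(candidates)
    for budget in budgets:
        scored = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = scored[:max(1, int(len(scored) * keep))]
    return survivors[0]


# Toy setup (assumed): two 2-layer "models" and a hidden target merge.
model_a = {"layer0": [1.0, 2.0, 3.0, 4.0], "layer1": [0.5, 0.5, 0.5, 0.5]}
model_b = {"layer0": [0.0, 0.0, 0.0, 0.0], "layer1": [2.0, 1.0, 0.0, -1.0]}
target = merge_layerwise(model_a, model_b, {"layer0": 0.8, "layer1": 0.2})


def toy_score(alphas, budget):
    """Negative squared error against the target, using only the first
    `budget` weights per layer as a cheap, low-fidelity proxy."""
    merged = merge_layerwise(model_a, model_b, alphas)
    err = 0.0
    for name in merged:
        for got, want in zip(merged[name][:budget], target[name][:budget]):
            err += (got - want) ** 2
    return -err


rng = random.Random(0)
candidates = [{name: rng.random() for name in model_a} for _ in range(16)]
best = successive_halving(candidates, toy_score, budgets=[1, 2, 4])
print(best)
```

In a real setting the budget would correspond to something like the number of validation examples used to score a merged model, so most candidate merges are rejected after only a cheap evaluation.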

Code Implementations: 1 repo