LG · AI · Aug 27, 2025

PSO-Merging: Merging Models Based on Particle Swarm Optimization

arXiv:2508.19839v1 · h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses model merging that is either computationally expensive or ineffective, offering AI practitioners a more efficient and scalable option. It appears incremental, since it builds on existing gradient-free methods.

The paper tackles the problem of efficiently merging multiple expert models into a single multitask model. It introduces PSO-Merging, a method based on Particle Swarm Optimization, which generally outperforms baseline merging methods in experiments on language models.

Model merging has emerged as an efficient strategy for constructing multitask models by integrating the strengths of multiple available expert models, thereby reducing the need to fine-tune a pre-trained model for all the tasks from scratch. Existing data-independent methods struggle with performance limitations due to the lack of data-driven guidance. Data-driven approaches also face key challenges: gradient-based methods are computationally expensive, limiting their practicality for merging large expert models, whereas existing gradient-free methods often fail to achieve satisfactory results within a limited number of optimization steps. To address these limitations, this paper introduces PSO-Merging, a novel data-driven merging method based on Particle Swarm Optimization (PSO). In this approach, we initialize the particle swarm with a pre-trained model, expert models, and sparsified expert models. We then perform multiple iterations, with the final global best particle serving as the merged model. Experimental results on different language models show that PSO-Merging generally outperforms baseline merging methods, offering a more efficient and scalable solution for model merging.
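The procedure described in the abstract (seed a swarm with the pre-trained model, the experts, and sparsified experts, iterate, return the global best particle) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the textbook PSO velocity update, treats each model as a flat NumPy parameter vector, and assumes magnitude-based sparsification of the task vector, which the abstract does not specify. The names `sparsify`, `pso_merge`, `toy_fitness`, and the hyperparameters `w`, `c1`, `c2`, `keep` are hypothetical, and the toy fitness stands in for a real validation score.

```python
import numpy as np

def sparsify(expert, base, keep=0.2):
    """Assumed sparsifier: keep the largest-magnitude entries of the
    task vector (expert - base); ties at the threshold are all kept."""
    delta = expert - base
    k = max(1, int(keep * delta.size))
    threshold = np.sort(np.abs(delta))[-k]
    return base + np.where(np.abs(delta) >= threshold, delta, 0.0)

def pso_merge(base, experts, fitness, steps=20, w=0.5, c1=1.5, c2=1.5, seed=0):
    """Textbook PSO over parameter vectors; fitness is maximized."""
    rng = np.random.default_rng(seed)
    # Swarm = pre-trained model + experts + sparsified experts.
    swarm = np.stack([base, *experts, *(sparsify(e, base) for e in experts)])
    velocity = np.zeros_like(swarm)
    pbest = swarm.copy()
    pbest_score = np.array([fitness(p) for p in swarm])
    g = int(np.argmax(pbest_score))
    gbest, gbest_score = pbest[g].copy(), pbest_score[g]

    for _ in range(steps):
        r1 = rng.random(swarm.shape)
        r2 = rng.random(swarm.shape)
        # Inertia + pull toward personal best + pull toward global best.
        velocity = w * velocity + c1 * r1 * (pbest - swarm) + c2 * r2 * (gbest - swarm)
        swarm = swarm + velocity
        scores = np.array([fitness(p) for p in swarm])
        improved = scores > pbest_score
        pbest[improved] = swarm[improved]
        pbest_score[improved] = scores[improved]
        g = int(np.argmax(pbest_score))
        if pbest_score[g] > gbest_score:
            gbest, gbest_score = pbest[g].copy(), pbest_score[g]
    # The final global best particle serves as the merged model.
    return gbest, gbest_score

# Toy setup (illustrative only): each "expert" is good on half the coordinates.
base = np.zeros(8)
expert_a = base.copy(); expert_a[:4] = 1.0
expert_b = base.copy(); expert_b[4:] = 1.0
target = np.ones(8)

def toy_fitness(params):
    # Stand-in for a validation metric: higher is better.
    return -float(np.sum((params - target) ** 2))

merged, merged_score = pso_merge(base, [expert_a, expert_b], toy_fitness)
```

In a real setting each fitness call would be a forward-pass evaluation of the candidate model on a small validation set, so the cost per step scales with swarm size but requires no gradients.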

