LG AI CL Oct 15, 2025

K-Merge: Online Continual Merging of Adapters for On-device Large Language Models

Donald Shenaj, Ondrej Bohdal, Taha Ceritli, Mete Ozay, Pietro Zanuttigh, Umberto Michieli

arXiv:2510.13537v1 9.42 citations h-index: 19

Originality: Incremental advance

AI Analysis

The paper addresses the problem of supporting a growing set of tasks on resource-constrained mobile devices, where users request LLM support for new tasks incrementally. It is an incremental improvement over existing model merging techniques.

The paper tackles the challenge of on-device online continual merging of Low-Rank Adapters (LoRAs) for Large Language Models, where new adapters are incrementally added under storage constraints, and proposes a data-free, efficient strategy that outperforms alternatives in real-world tasks while adhering to device limitations.

On-device deployment of Large Language Models (LLMs) frequently leverages Low-Rank Adapters (LoRAs) to support diverse downstream tasks under tight resource constraints. To address the limited storage capacity of mobile devices, recent works have explored model merging techniques to fuse multiple LoRAs into a single one. In practice, however, LoRAs are often delivered incrementally, as users request support for new tasks (e.g., novel problem types or languages). This scenario introduces a new challenge: on-device online continual merging, where the objective is to incorporate new LoRAs while preserving the performance on previously supported tasks. In this paper, we propose a data-free and computationally efficient strategy for selecting and merging LoRAs when a new one becomes available, assuming the device can store only a limited number of adapters. Extensive experiments across real-world tasks demonstrate the superiority of our approach compared to alternative strategies while adhering to the storage budget and compute limitations of on-device settings.
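The online setting described in the abstract can be sketched as a small loop over a fixed number of adapter slots. The code below is only an illustration of that setting, not the paper's algorithm: the abstract does not say how adapters are selected or merged, so the cosine-similarity selection and the running-average merge are assumptions, and `Slot`, `cosine`, `add_adapter`, and `budget` are hypothetical names. Each adapter is represented as a flat list of floats standing in for its flattened weight update.

```python
import math
from dataclasses import dataclass


@dataclass
class Slot:
    weights: list      # flattened adapter update (hypothetical representation)
    count: int = 1     # how many adapters have been merged into this slot


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def add_adapter(slots, new_weights, budget):
    """Incorporate a newly delivered adapter without exceeding `budget` stored slots.

    Assumed policy (not taken from the paper): store the adapter while a slot is
    free; otherwise merge it into the most similar stored slot with a running
    average. Needs no task data, only the adapter weights themselves.
    Returns the index of the slot that now serves the new task.
    """
    if len(slots) < budget:
        slots.append(Slot(list(new_weights)))
        return len(slots) - 1

    # Pick the stored slot whose weights point in the most similar direction.
    best = max(range(len(slots)), key=lambda i: cosine(slots[i].weights, new_weights))
    slot = slots[best]

    # Running average keeps every adapter merged so far equally weighted.
    n = slot.count
    slot.weights = [(n * w + x) / (n + 1) for w, x in zip(slot.weights, new_weights)]
    slot.count = n + 1
    return best
```

Calling `add_adapter(slots, weights, budget=2)` three times on an initially empty `slots` list fills both slots and then merges the third adapter into whichever stored slot it resembles most, so the number of stored adapters never exceeds the budget.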
