FREE-Switch: Frequency-based Dynamic LoRA Switch for Style Transfer
For practitioners needing low-cost customized image generation by merging pretrained adapters, this method offers a training-free solution that mitigates content drift and detail loss.
The paper tackles the problem of combining multiple LoRA adapters for diffusion-based image generation without content drift or detail degradation. The proposed FREE-Switch framework uses frequency-domain importance-driven dynamic switching and automatic generation alignment, achieving efficient adapter fusion with reduced training cost.
With the growing availability of open-source adapters trained on the same diffusion backbone for diverse scenes and objects, combining these pretrained weights enables low-cost customized generation. However, most existing model merging methods are designed for classification or text generation; when applied to image generation, they suffer from content drift because errors accumulate across multiple diffusion steps. Among image-oriented methods, training-based approaches are computationally expensive and unsuitable for edge deployment, while training-free ones apply uniform fusion strategies that ignore inter-adapter differences, leading to detail degradation. We find that, because different adapters specialize in generating different types of content, each diffusion step matters to a different degree for each adapter. Accordingly, we propose a frequency-domain importance-driven dynamic LoRA switch method. Furthermore, we observe that maintaining semantic consistency across adapters effectively mitigates detail loss; thus, we design an automatic Generation Alignment mechanism to align generation intents at the semantic level. Experiments demonstrate that our FREE-Switch (Frequency-based Efficient and Dynamic LoRA Switch) framework efficiently combines adapters for different objects and styles, substantially reducing the training cost of high-quality customized generation.
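The idea of switching adapters per diffusion step according to a frequency-domain importance score can be illustrated with a small sketch. The abstract does not specify the actual importance measure or switching rule, so everything below is an assumption made for illustration: the score is taken to be the share of spectral energy above a radial cutoff, the adapter with the highest score is activated at each step, and the names `high_freq_ratio`, `switch_schedule`, and `cutoff` are hypothetical rather than part of the FREE-Switch implementation.

```python
import numpy as np


def high_freq_ratio(update, cutoff=0.25):
    """Share of spectral energy above a normalized radial frequency cutoff.

    Assumption: this stands in for the paper's frequency-domain importance
    score, which the abstract does not define.
    """
    power = np.abs(np.fft.fftshift(np.fft.fft2(update))) ** 2
    h, w = update.shape
    # Frequency coordinates in cycles per sample, shifted to match `power`.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[radius > cutoff].sum() / total)


def switch_schedule(updates_by_adapter, cutoff=0.25):
    """Pick one adapter per step: the one whose update scores highest.

    `updates_by_adapter` maps a (hypothetical) adapter name to a list of
    2D arrays, one per diffusion step. The argmax rule is an assumption.
    """
    names = list(updates_by_adapter)
    num_steps = len(updates_by_adapter[names[0]])
    schedule = []
    for t in range(num_steps):
        scores = {n: high_freq_ratio(updates_by_adapter[n][t], cutoff)
                  for n in names}
        schedule.append(max(scores, key=scores.get))
    return schedule
```

In this toy setting, an adapter whose per-step update is dominated by fine detail would be selected over one whose update is smooth. A real pipeline would compute the updates from the denoiser with each LoRA applied, which this sketch does not attempt.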