BYOM: Building Your Own Multi-Task Model For Free
This work addresses how researchers and practitioners can build efficient multi-task models without retraining. The contribution is incremental, since it builds on existing merging methods.
The paper tackles the performance deterioration seen in multi-task model merging by proposing BYOM, which injects task-specific knowledge into the merged model through two parameter-efficient approaches (BYOM-FFT and BYOM-LoRA). Experiments on computer vision and natural language processing tasks show that BYOM outperforms existing merging methods by a large margin.
Recently, various merging methods have been proposed to build a multi-task model from task-specific finetuned models without retraining. However, existing methods suffer from a large performance deterioration compared to using multiple task-specific models. In this paper, we propose to inject task-specific knowledge into the merged model and design two parameter-efficient approaches (BYOM-FFT and BYOM-LoRA) to Build Your Own Multi-task model. BYOM-FFT is for merging fully finetuned models, while BYOM-LoRA is for LoRA-finetuned models. Both methods are data-free and computation-efficient. Extensive experiments on computer vision and natural language processing tasks show that the proposed BYOM methods outperform existing merging methods by a large margin. Moreover, BYOM-FFT is general and can be integrated into existing merging methods to further boost performance.
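The abstract does not spell out how task-specific knowledge is injected into the merged model. As an illustration only, the sketch below assumes a task-arithmetic style merge (one of the existing merging methods) and a hypothetical injection step that stores, for each task, a sparse residual between the finetuned weights and the merged weights. The function names, the scaling factor `lam`, and the `keep_ratio` parameter are assumptions introduced here, not the authors' implementation of BYOM-FFT or BYOM-LoRA.

```python
import numpy as np


def merge_task_arithmetic(theta0, finetuned, lam=0.3):
    """Data-free merge: add the scaled sum of task vectors to the pretrained weights.

    theta0    : pretrained weights (flat array)
    finetuned : list of task-specific finetuned weights, same shape as theta0
    lam       : scaling factor (hypothetical default)
    """
    task_vectors = [theta_t - theta0 for theta_t in finetuned]
    return theta0 + lam * np.sum(task_vectors, axis=0)


def sparse_residual(theta_t, theta_merged, keep_ratio=0.1):
    """Assumed injection step: keep only the largest-magnitude entries of
    (theta_t - theta_merged) and zero the rest, so each task adds few parameters."""
    residual = theta_t - theta_merged
    k = max(1, int(round(keep_ratio * residual.size)))
    # The k-th largest absolute value acts as the keep threshold.
    threshold = np.partition(np.abs(residual).ravel(), -k)[-k]
    return np.where(np.abs(residual) >= threshold, residual, 0.0)


def task_model(theta_merged, residual_t):
    """Task-specific model = shared merged weights + that task's sparse residual."""
    return theta_merged + residual_t


# Toy usage with random "weights" standing in for real model parameters.
rng = np.random.default_rng(0)
theta0 = rng.normal(size=1000)
finetuned = [theta0 + 0.1 * rng.normal(size=1000) for _ in range(3)]

merged = merge_task_arithmetic(theta0, finetuned)
residuals = [sparse_residual(t, merged, keep_ratio=0.1) for t in finetuned]
models = [task_model(merged, r) for r in residuals]
```

Under these assumptions, the merged weights are shared across tasks and only the sparse residuals are stored per task. No training data or gradient steps are involved, which is consistent with the data-free and computation-efficient framing in the abstract.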