LG | CL | CV | Oct 3, 2023

BYOM: Building Your Own Multi-Task Model For Free

arXiv:2310.01886v3 | 6 citations | h-index: 13
Originality: Incremental advance
AI Analysis

This paper addresses how researchers and practitioners can build efficient multi-task models without retraining. The advance is incremental because it builds on existing merging methods.

The paper tackles performance deterioration in multi-task model merging by proposing BYOM, which injects task-specific knowledge into the merged model through two parameter-efficient approaches (BYOM-FFT and BYOM-LoRA). Experiments on computer vision and natural language processing tasks show that BYOM outperforms existing merging methods by a large margin.

Recently, various merging methods have been proposed to build a multi-task model from task-specific finetuned models without retraining. However, existing methods suffer from a large performance deterioration compared to using multiple task-specific models. In this paper, we propose to inject task-specific knowledge into the merged model and design two parameter-efficient approaches (BYOM-FFT and BYOM-LoRA) to Build Your Own Multi-task model. BYOM-FFT is for merging fully finetuned models, while BYOM-LoRA is for LoRA-finetuned models. Both methods are data-free and computation-efficient. Extensive experiments on computer vision and natural language processing tasks show that the proposed BYOM methods outperform existing merging methods by a large margin. Moreover, BYOM-FFT is general and can be integrated into existing merging methods to further boost performance.
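The abstract does not spell out how the merged model is formed or how task-specific knowledge is injected, so the following is only an illustrative sketch under stated assumptions. It assumes a task-arithmetic style merge (pretrained weights plus a scaled sum of task vectors) and assumes the injected knowledge is a sparse, magnitude-pruned residual between each finetuned model and the merged model. All function names and the `lam` and `keep_ratio` parameters are hypothetical, and weights are modeled as flat lists of floats for simplicity.

```python
from typing import Dict, List

Vector = List[float]


def task_arithmetic_merge(pretrained: Vector, finetuned: List[Vector], lam: float) -> Vector:
    """Assumed merge rule: pretrained + lam * sum of (finetuned - pretrained)."""
    merged = list(pretrained)
    for theta in finetuned:
        for i, (w, w0) in enumerate(zip(theta, pretrained)):
            merged[i] += lam * (w - w0)
    return merged


def sparse_residual(theta_t: Vector, merged: Vector, keep_ratio: float) -> Dict[int, float]:
    """Assumed injection: keep only the largest-magnitude entries of (theta_t - merged)."""
    diff = [w - m for w, m in zip(theta_t, merged)]
    k = max(1, int(len(diff) * keep_ratio))
    top = sorted(range(len(diff)), key=lambda i: abs(diff[i]), reverse=True)[:k]
    return {i: diff[i] for i in top}


def build_task_model(merged: Vector, residual: Dict[int, float]) -> Vector:
    """Add the stored sparse residual back onto the shared merged weights."""
    out = list(merged)
    for i, delta in residual.items():
        out[i] += delta
    return out


def l1_error(a: Vector, b: Vector) -> float:
    """Sum of absolute differences, used here only to compare weight vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    pretrained = [0.0, 0.0, 0.0, 0.0]
    task_a = [1.0, 0.0, 0.1, 0.0]
    task_b = [0.0, -1.0, 0.0, 0.1]

    merged = task_arithmetic_merge(pretrained, [task_a, task_b], lam=0.5)
    residual_a = sparse_residual(task_a, merged, keep_ratio=0.5)
    model_a = build_task_model(merged, residual_a)

    # The task-specific model should sit closer to task_a than the merged model does.
    print(l1_error(merged, task_a), l1_error(model_a, task_a))
```

In this toy setting, each task stores only the shared merged weights plus a small sparse dictionary, which is one way a method could stay data-free and cheap to compute. Whether BYOM-FFT or BYOM-LoRA works this way should be checked against the paper itself.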

