Fine-tuning Aligned Classifiers for Merging Outputs: Towards a Superior Evaluation Protocol in Model Merging
This work addresses a specific bottleneck in model merging for classification, offering an incremental improvement to both the evaluation protocol and classification performance.
The paper tackles the misalignment between merging outputs and fine-tuned classifiers in model merging for classification tasks, and shows that alleviating this misalignment significantly improves performance. It proposes FT-Classifier, a new protocol that fine-tunes an aligned classifier with few-shot unlabeled samples to improve both evaluation and classification results.
Model merging combines multiple fine-tuned models into a single one via parameter fusion, achieving improvements across many tasks. However, in classification tasks, we find a misalignment between merging outputs and the fine-tuned classifier, which limits the effectiveness of merging. In this paper, we first demonstrate the following observations: (1) merging outputs exhibit a cluster effect comparable to that of fine-tuned outputs and already contain the necessary classification information; (2) the misalignment between merging outputs and the fine-tuned classifier can converge to an orthogonal transformation, and alleviating this misalignment can significantly enhance the performance of merged models. Based on these observations, we then propose a new protocol, FT-Classifier, which fine-tunes an aligned classifier with few-shot unlabeled samples, enabling better evaluation of merging methods and improved classification performance.
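Observation (2) can be illustrated with a small synthetic sketch. This is not the FT-Classifier procedure itself, whose details are not given here. It swaps in the classical orthogonal Procrustes solution as a stand-in for alignment, and every name and value in it (`orthogonal_procrustes`, the Gaussian clusters, the rotation `Q`, the noise level, the choice of 32 samples) is an assumption made for illustration. Labels are used only to measure accuracy; the alignment step sees only paired features from the two models on the same unlabeled inputs.

```python
import numpy as np

def orthogonal_procrustes(source, target):
    """Return the orthogonal matrix R minimizing ||source @ R - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

rng = np.random.default_rng(0)
n_classes, dim, n_per_class = 4, 16, 50

# Synthetic stand-in for fine-tuned outputs: one Gaussian cluster per class.
centers = 5.0 * rng.normal(size=(n_classes, dim))
labels = np.repeat(np.arange(n_classes), n_per_class)
finetuned = centers[labels] + rng.normal(size=(labels.size, dim))

# Linear classifier matched to the fine-tuned features
# (its argmax is the nearest cluster center).
W = centers.T                           # shape (dim, n_classes)
b = -0.5 * np.sum(centers**2, axis=1)   # shape (n_classes,)

def accuracy(features):
    """Accuracy of the fixed classifier (W, b) on the given features."""
    return float(np.mean(np.argmax(features @ W + b, axis=1) == labels))

# Synthetic stand-in for merging outputs (assumption): the same clusters
# under a random orthogonal transformation, plus a little noise.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
merged = finetuned @ Q + 0.1 * rng.normal(size=finetuned.shape)

# Estimate the alignment from a few unlabeled samples: only paired
# features are needed, not class labels.
few = rng.choice(labels.size, size=32, replace=False)
R = orthogonal_procrustes(merged[few], finetuned[few])

acc_before = accuracy(merged)       # classifier applied to misaligned features
acc_after = accuracy(merged @ R)    # same classifier after alignment
print(f"accuracy before alignment: {acc_before:.3f}")
print(f"accuracy after alignment:  {acc_after:.3f}")
```

Rotating the features by `R` is equivalent to keeping the features fixed and using the classifier weights `R @ W`, which is one way to read "aligned classifier" in this toy setting. A fine-tuning-based protocol could presumably also absorb misalignment that is not exactly orthogonal, which a closed-form rotation cannot.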