LG · Dec 18, 2024

Fine-tuning Aligned Classifiers for Merging Outputs: Towards a Superior Evaluation Protocol in Model Merging

arXiv:2412.13526v2 · 1 citation · h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in model merging for classification, offering an incremental improvement to evaluation protocols and performance in this domain.

The paper tackles the misalignment between merging outputs and fine-tuned classifiers in model merging for classification tasks. It shows that alleviating this misalignment significantly enhances performance, and it proposes FT-Classifier, a new protocol that fine-tunes an aligned classifier with few-shot unlabeled samples to improve both evaluation and classification results.

Model merging combines multiple fine-tuned models into a single one via parameter fusion, achieving improvements across many tasks. However, in classification tasks, we find a misalignment between the merging outputs and the fine-tuned classifier, which limits the effectiveness of merging. In this paper, we first demonstrate the following observations: (1) merging outputs exhibit a clustering effect comparable to that of fine-tuned outputs and already contain the necessary classification information; (2) the misalignment between merging outputs and the fine-tuned classifier can converge to an orthogonal transformation, and alleviating this misalignment can significantly enhance the performance of merged models. Based on these observations, we then propose a new protocol, FT-Classifier, which fine-tunes an aligned classifier with few-shot unlabeled samples, enabling better evaluation of merging methods and improved classification performance.
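
Observation (2) can be illustrated with a toy example. The sketch below is not the paper's FT-Classifier procedure. It assumes, purely for illustration, that the merged model's outputs are an exact orthogonal rotation of the fine-tuned model's outputs, and that both models can be run on the same small set of unlabeled inputs. Under those assumptions, an orthogonal Procrustes fit on the few-shot samples recovers the rotation, and the frozen classifier head works again. All names (`demo`, `procrustes_rotation`, the synthetic data) are hypothetical.

```python
import numpy as np


def random_orthogonal(d, rng):
    """Draw a random d x d orthogonal matrix via QR of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q


def procrustes_rotation(src, dst):
    """Orthogonal R minimising ||src @ R - dst||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(src.T @ dst)
    return u @ vt


def accuracy(features, head, labels):
    """Accuracy of a bias-free linear head applied to the given features."""
    return float(np.mean(np.argmax(features @ head, axis=1) == labels))


def demo(seed=0, d=16, n_classes=4, n=400, n_few=64):
    rng = np.random.default_rng(seed)

    # Synthetic "fine-tuned outputs": tight clusters around orthogonal centers.
    centers = 5.0 * random_orthogonal(d, rng)[:n_classes]
    labels = rng.integers(0, n_classes, size=n)
    ft_out = centers[labels] + 0.1 * rng.standard_normal((n, d))

    # Stand-in for the fine-tuned classifier: one weight column per center.
    head = centers.T

    # Assumption: merged outputs are a pure orthogonal rotation of ft_out.
    q = random_orthogonal(d, rng)
    merged_out = ft_out @ q

    # Estimate the rotation from a few unlabeled samples (labels are unused).
    r = procrustes_rotation(merged_out[:n_few], ft_out[:n_few])

    return {
        "ft": accuracy(ft_out, head, labels),
        "misaligned": accuracy(merged_out, head, labels),
        "aligned": accuracy(merged_out @ r, head, labels),
        "rotation_error": float(np.linalg.norm(r - q.T)),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.4f}")
```

In this toy setting the clusters survive the rotation intact, which mirrors observation (1): the information needed for classification is still present, and only the classifier's frame of reference is off. Real merged models will not satisfy the exact-rotation assumption, which is presumably why the paper fine-tunes an aligned classifier rather than solving for a rotation in closed form.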

Code Implementations: 1 repo