LG, CV | Mar 28, 2024

Model Stock: All we need is just a few fine-tuned models

arXiv:2403.19522v2 | 89 citations | h-index: 38 | Has Code | ECCV
Originality: Incremental advance
AI Analysis

This work reduces the computational burden on machine learning practitioners by cutting the number of fine-tuned models needed for weight averaging. It is an incremental advance because it builds on existing fine-tuning and weight-averaging techniques.

The paper tackles the inefficiency of needing many fine-tuned models for averaging by introducing Model Stock, a method that uses only two fine-tuned models to achieve superior accuracy on in-distribution and out-of-distribution tasks, surpassing state-of-the-art methods like Model Soup.

This paper introduces an efficient fine-tuning method for large pre-trained models, offering strong in-distribution (ID) and out-of-distribution (OOD) performance. Breaking away from traditional practices that need a multitude of fine-tuned models for averaging, our approach employs significantly fewer models to obtain the final weights yet yields superior accuracy. Drawing from key insights in the weight space of fine-tuned weights, we uncover a strong link between performance and proximity to the center of weight space. Based on this, we introduce a method that approximates a center-close weight using only two fine-tuned models, applicable during or after training. Our layer-wise weight averaging technique surpasses state-of-the-art model averaging methods such as Model Soup while using only two fine-tuned models. This strategy can be aptly coined Model Stock, highlighting its reliance on selecting a minimal number of models to draw a more optimized averaged model. We demonstrate the efficacy of Model Stock with fine-tuned models based upon pre-trained CLIP architectures, achieving remarkable performance on both ID and OOD tasks on the standard benchmarks, all while adding almost no extra computational cost. Our code and pre-trained models are available at https://github.com/naver-ai/model-stock.
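
The layer-wise averaging described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that). The abstract does not state the interpolation rule, so the ratio t = 2cos(θ) / (1 + cos(θ)) used here is an assumption drawn from outside this summary, where θ is the angle between the two fine-tuned models' offsets from the pre-trained weights in a given layer. The function name `model_stock_merge`, the dict-of-arrays layout, and the clamping of negative cosines to zero are all hypothetical choices made for illustration.

```python
import numpy as np


def model_stock_merge(pretrained, finetuned_a, finetuned_b, eps=1e-12):
    """Merge two fine-tuned models layer by layer, anchored on the pre-trained weights.

    Each argument is a dict mapping a layer name to a NumPy array (hypothetical layout).
    """
    merged = {}
    for name, w0 in pretrained.items():
        # Offsets of each fine-tuned layer from the pre-trained layer.
        d1 = (finetuned_a[name] - w0).ravel()
        d2 = (finetuned_b[name] - w0).ravel()

        # Cosine of the angle between the two offsets (0 if either offset is zero).
        denom = np.linalg.norm(d1) * np.linalg.norm(d2)
        cos = float(d1 @ d2 / denom) if denom > eps else 0.0
        cos = max(cos, 0.0)  # assumption: treat negative agreement as no agreement

        # Assumed two-model interpolation ratio: 0 -> pre-trained, 1 -> plain average.
        t = 2.0 * cos / (1.0 + cos)

        avg = 0.5 * (finetuned_a[name] + finetuned_b[name])
        merged[name] = t * avg + (1.0 - t) * w0
    return merged


# Toy usage with a single two-parameter "layer".
w0 = {"layer": np.zeros(2)}
wa = {"layer": np.array([1.0, 0.0])}
wb = {"layer": np.array([0.5, np.sqrt(3.0) / 2.0])}  # 60 degrees away from wa
merged = model_stock_merge(w0, wa, wb)
```

Under this assumed rule, layers where the two fine-tuned models agree (small angle) move toward their plain average, while layers where they disagree stay closer to the pre-trained weights.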

Code Implementations: 2 repos
