LG | AI | May 27, 2025

TuneComp: Joint Fine-tuning and Compression for Large Foundation Models

arXiv:2505.21835v1 | h-index: 5
Originality: Incremental advance
AI Analysis

The paper addresses model size reduction, a practical concern for AI practitioners, but the advance is incremental because it builds on existing compression techniques.

The paper targets the performance loss and inefficiency of fine-tuning a large foundation model first and compressing it afterwards. It proposes a joint approach that fine-tunes and compresses at the same time, and reports that this significantly outperforms sequential methods.

To reduce model size during post-training, compression methods, including knowledge distillation, low-rank approximation, and pruning, are often applied after fine-tuning the model. However, sequential fine-tuning and compression sacrifices performance while creating a larger-than-necessary model as an intermediate step. In this work, we aim to reduce this gap by directly constructing a smaller model while guided by the downstream task. We propose to jointly fine-tune and compress the model by gradually distilling it to a pruned low-rank structure. Experiments demonstrate that joint fine-tuning and compression significantly outperforms sequential compression methods.
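
The abstract does not describe the training procedure in detail, so the following is only a rough sketch of what a "pruned low-rank structure" and a gradual hand-over to it could look like for a single linear layer. Everything here is an assumption made for illustration: the function names (`truncated_svd_factors`, `magnitude_prune`, `blended_forward`), the blending weight `alpha`, the rank of 8, and the 50% sparsity are hypothetical and are not taken from the paper.

```python
import numpy as np

def truncated_svd_factors(W, rank):
    """Return (B, A) so that B @ A is the best rank-`rank` approximation of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    B = U[:, :rank] * s[:rank]   # shape (out_features, rank)
    A = Vt[:rank, :]             # shape (rank, in_features)
    return B, A

def magnitude_prune(M, sparsity):
    """Zero out the fraction `sparsity` of entries of M with smallest magnitude."""
    k = int(sparsity * M.size)
    if k == 0:
        return M.copy()
    # k-th smallest absolute value; entries at or below it are dropped.
    threshold = np.partition(np.abs(M).ravel(), k - 1)[k - 1]
    return np.where(np.abs(M) > threshold, M, 0.0)

def blended_forward(x, W, B, A, alpha):
    """Mix the dense layer (alpha=0) with the compressed layer (alpha=1).

    Assumption: ramping alpha from 0 to 1 during training stands in for
    'gradually' moving to the compressed structure; the paper may do this
    differently.
    """
    dense_out = x @ W.T
    compressed_out = x @ A.T @ B.T
    return (1.0 - alpha) * dense_out + alpha * compressed_out

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))            # dense weight: 2048 parameters
B, A = truncated_svd_factors(W, rank=8)      # low-rank factors: 768 parameters
A_sparse = magnitude_prune(A, sparsity=0.5)  # prune half of one factor
x = rng.standard_normal((4, 32))

for alpha in (0.0, 0.5, 1.0):
    y = blended_forward(x, W, B, A_sparse, alpha)
    print(f"alpha={alpha}: output shape {y.shape}")
```

In a real joint setup the factors would be trained on the downstream task loss (possibly with a distillation term against the dense model) while `alpha` increases; this sketch only shows the structural side of that idea, not the optimization.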

