LG AI

May 27, 2025

TuneComp: Joint Fine-tuning and Compression for Large Foundation Models

Xiangyu Chen, Jing Liu, Ye Wang, Matthew Brand, Pu Wang, Toshiaki Koike-Akino

arXiv:2505.21835v1 4.1 h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses model size reduction for AI practitioners, but the advance is incremental because it builds on existing compression techniques.

The paper tackles the performance loss and inefficiency of fine-tuning a large foundation model first and compressing it afterwards. It proposes a joint approach that fine-tunes and compresses simultaneously, which the authors report significantly outperforms sequential methods.

To reduce model size during post-training, compression methods, including knowledge distillation, low-rank approximation, and pruning, are often applied after fine-tuning the model. However, sequential fine-tuning and compression sacrifices performance while creating a larger-than-necessary model as an intermediate step. In this work, we aim to reduce this gap by directly constructing a smaller model guided by the downstream task. We propose to jointly fine-tune and compress the model by gradually distilling it to a pruned low-rank structure. Experiments demonstrate that joint fine-tuning and compression significantly outperforms sequential compression methods.
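The idea of gradually moving a weight matrix toward a low-rank structure can be illustrated with a small sketch. This is not the paper's algorithm: the truncated-SVD initialization, the linear blending coefficient `lam`, and the helper names `low_rank_factors` and `blended_weight` are all assumptions made for illustration, and the sketch omits fine-tuning, distillation losses, and pruning entirely.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Hypothetical helper: rank-`rank` factors A, B with A @ B ~= W (truncated SVD)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # shape (out_dim, rank), singular values folded in
    B = Vt[:rank, :]             # shape (rank, in_dim)
    return A, B

def blended_weight(W, A, B, lam):
    """Hypothetical helper: interpolate between the dense weight and its low-rank form.

    lam = 0 uses the original dense weight; lam = 1 uses only A @ B.
    """
    return (1.0 - lam) * W + lam * (A @ B)

# Toy dense layer standing in for one weight matrix of a large model (assumption).
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
A, B = low_rank_factors(W, rank=2)

# Ramp lam from 0 to 1 to mimic a gradual hand-over to the compressed structure.
schedule = np.linspace(0.0, 1.0, 5)
effective_weights = [blended_weight(W, A, B, lam) for lam in schedule]

dense_params = W.size                 # parameters stored by the dense layer
low_rank_params = A.size + B.size     # parameters stored by the factorized layer
```

In a real training loop the factors would be updated by gradient descent on the downstream task while `lam` increases, so that only `A` and `B` need to be stored at the end; here the schedule only shows how the effective weight changes from full rank to rank 2.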
