CL · Feb 25, 2025

Compressing Language Models for Specialized Domains

Miles Williams, George Chrysostomou, Vitor Jeronymo, Nikolaos Aletras

arXiv:2502.18424v14.91 citations · h-index: 29

Originality: Highly original

AI Analysis

This paper addresses the challenge of efficiently deploying compressed language models in specialized domains such as biomedical or legal, offering a practical solution that avoids expensive fine-tuning.

The paper tackles language model compression for specialized domains, where general-purpose compression methods degrade in-domain performance. It proposes cross-calibration, a training-free approach that improves domain performance without compromising general performance or adding computational overhead.

Compression techniques such as pruning and quantization offer a solution for more efficient deployment of language models (LMs), albeit with small drops in benchmark performance. However, general-purpose LM compression methods can negatively affect performance in specialized domains (e.g. biomedical or legal). Recent work has sought to address this, yet requires computationally expensive full-parameter fine-tuning. To this end, we propose cross-calibration, a novel training-free approach for improving the domain performance of compressed LMs. Our approach effectively leverages Hessian-based sensitivity to identify weights that are influential for both in-domain and general performance. Through extensive experimentation, we demonstrate that cross-calibration substantially outperforms existing approaches on domain-specific tasks, without compromising general performance. Notably, these gains come without additional computational overhead, displaying remarkable potential towards extracting domain-specialized compressed models from general-purpose LMs.
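
To make the idea of Hessian-based sensitivity over two calibration sets concrete, here is a minimal sketch for a single linear layer. It is not the authors' implementation: the abstract does not specify how the in-domain and general signals are combined, so the equal-weight sum of the two Hessian proxies, the OBS-style saliency score w^2 / [H^-1]_jj, the per-row pruning, and all function and parameter names (hessian_proxy, cross_calibrated_mask, damp) are assumptions made for illustration.

```python
import numpy as np


def hessian_proxy(x):
    """Layer-wise Hessian proxy H = X^T X / n for inputs x of shape (n, d_in)."""
    return x.T @ x / x.shape[0]


def cross_calibrated_mask(weight, x_general, x_domain, sparsity=0.5, damp=0.01):
    """Return a boolean keep-mask for `weight` of shape (d_out, d_in).

    Assumption: the general and in-domain Hessian proxies are summed with
    equal weight. The paper may combine them differently.
    """
    h = hessian_proxy(x_general) + hessian_proxy(x_domain)
    # Dampen the diagonal so that H is invertible even for inactive features.
    h = h + damp * np.mean(np.diag(h)) * np.eye(h.shape[0])
    h_inv_diag = np.diag(np.linalg.inv(h))

    # OBS-style saliency: a low score means removing the weight changes
    # the layer output little on the combined calibration data.
    saliency = weight ** 2 / h_inv_diag

    # Prune the k lowest-saliency weights in every output row.
    k = int(weight.shape[1] * sparsity)
    prune_idx = np.argsort(saliency, axis=1)[:, :k]
    mask = np.ones(weight.shape, dtype=bool)
    np.put_along_axis(mask, prune_idx, False, axis=1)
    return mask
```

In this sketch, an input feature that is active only in the in-domain calibration data still contributes to the combined Hessian, so the weights attached to it are less likely to be pruned than they would be under general-only calibration.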
