CL, Aug 30, 2022

To Adapt or to Fine-tune: A Case Study on Abstractive Summarization

arXiv:2208.14559v1582 citations, h-index: 13
Originality: Synthesis-oriented
AI Analysis

This work addresses the efficiency-performance trade-off for researchers and practitioners in NLP, but it is incremental as it compares existing methods on summarization tasks.

The study investigated whether using lightweight adapters instead of full fine-tuning improves efficiency without sacrificing performance in abstractive summarization, finding that fine-tuning generally performs better except under extremely low-resource conditions where adapters excel.

Recent advances in the field of abstractive summarization leverage pre-trained language models rather than training a model from scratch. However, such models are slow to train and carry a massive overhead. Researchers have proposed a few lightweight alternatives, such as smaller adapters, to mitigate these drawbacks. Nonetheless, it remains uncertain whether using adapters benefits the task of summarization, in terms of improved efficiency without an unacceptable sacrifice in performance. In this work, we carry out multifaceted investigations on fine-tuning and adapters for summarization tasks with varying complexity: language, domain, and task transfer. In our experiments, fine-tuning a pre-trained language model generally attains better performance than using adapters; the performance gap positively correlates with the amount of training data used. Notably, adapters exceed fine-tuning under extremely low-resource conditions. We further provide insights on multilinguality, model convergence, and robustness, hoping to shed light on the pragmatic choice of fine-tuning or adapters in abstractive summarization.
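
To make the efficiency side of this trade-off concrete, the sketch below counts trainable parameters for adapters versus full fine-tuning. It assumes the common bottleneck adapter design (a down-projection, a nonlinearity, and an up-projection around a residual connection); the abstract does not state which adapter variant the authors used. The function names and every number in the configuration are hypothetical and are not taken from the paper.

```python
def adapter_params(d_model: int, bottleneck: int) -> int:
    """Trainable parameters in one bottleneck adapter (assumed design).

    Down-projection: d_model x bottleneck weights plus bottleneck biases.
    Up-projection:   bottleneck x d_model weights plus d_model biases.
    """
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up


def trainable_fraction(total_model_params: int, num_adapters: int,
                       d_model: int, bottleneck: int) -> float:
    """Share of parameters updated when only the adapters are trained."""
    return num_adapters * adapter_params(d_model, bottleneck) / total_model_params


# Hypothetical configuration, chosen only for illustration.
D_MODEL = 1024              # hidden size of the pre-trained model
BOTTLENECK = 64             # adapter bottleneck width
NUM_ADAPTERS = 48           # adapters inserted across the model
TOTAL_PARAMS = 600_000_000  # size of the frozen pre-trained model

per_adapter = adapter_params(D_MODEL, BOTTLENECK)
fraction = trainable_fraction(TOTAL_PARAMS, NUM_ADAPTERS, D_MODEL, BOTTLENECK)
print(f"parameters per adapter: {per_adapter}")
print(f"trainable share with adapters: {fraction:.2%} (full fine-tuning: 100%)")
```

Under these assumed numbers, adapter training updates roughly one percent of the parameters that full fine-tuning would. That gap is the efficiency gain the study weighs against the performance difference.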

Code Implementations: 1 repo