CL May 22, 2023

InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT

Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng

arXiv:2305.13083v121.6138 citations

Originality: Incremental advance

AI Analysis

This paper addresses the problem of deploying efficient, high-quality summarization models in applications where the cost of large models is prohibitive. The advance is incremental because it builds on existing distillation techniques.

The paper tackles two problems: the high serving and fine-tuning cost of large models such as GPT-3.5, and the inferior human-evaluated quality of smaller summarization models. It proposes InheritSumm, a compact model distilled from GPT-3.5, which achieves similar or superior performance to GPT-3.5 in zero-shot and few-shot settings and outperforms the previous best small models in fine-tuning scenarios.

While large models such as GPT-3 demonstrate exceptional performance in zero-shot and few-shot summarization tasks, their extensive serving and fine-tuning costs hinder their utilization in various applications. Conversely, previous studies have found that although automatic metrics tend to favor smaller fine-tuned models, the quality of the summaries they generate is inferior to that of larger models like GPT-3 when assessed by human evaluators. To address this issue, we propose InheritSumm, a versatile and compact summarization model derived from GPT-3.5 through distillation. InheritSumm not only exhibits zero-shot and few-shot summarization capabilities comparable to those of GPT-3.5 but is also sufficiently compact for fine-tuning purposes. Experimental results demonstrate that InheritSumm achieves similar or superior performance to GPT-3.5 in zero-shot and few-shot settings. Furthermore, it outperforms the previously established best small models in both prefix-tuning and full-data fine-tuning scenarios.
