Abstractive Text Summarization Using the BRIO Training Paradigm
This work addresses the lack of control and the heavy reliance on reference summaries in abstractive summarization. It offers a straightforward fine-tuning technique that improves performance, particularly for low-resource languages such as Vietnamese. The contribution is incremental, however, because it builds on pre-trained models and an existing training paradigm.
The paper tackles the over-dependence of abstractive summarization models on reference summaries by adopting the BRIO training paradigm, which assumes a non-deterministic distribution over candidate summaries to reduce this reliance and improve inference performance. The reported results show that models trained with BRIO on CNNDM and on a new Vietnamese dataset (VieSum) outperform all existing abstractive summarization models, especially for Vietnamese, even when trained on basic hardware.
Summary sentences produced by abstractive summarization models may be coherent and comprehensive, but they lack control and rely heavily on reference summaries. The BRIO training paradigm assumes a non-deterministic distribution to reduce the model's dependence on reference summaries and to improve model performance during inference. This paper presents a straightforward but effective technique for improving abstractive summaries: fine-tuning pre-trained language models and then training them with the BRIO paradigm. We build a text summarization dataset for Vietnamese, called VieSum. We perform experiments with abstractive summarization models trained with the BRIO paradigm on the CNNDM and VieSum datasets. The results show that the models, trained on basic hardware, outperform all existing abstractive summarization models, especially for Vietnamese.
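To make the idea of a non-deterministic distribution more concrete, the following is a minimal sketch of a pairwise ranking objective of the kind BRIO is generally described as using: several candidate summaries are ordered by quality, and the model is penalized when it scores a worse candidate above a better one. The text above does not spell out the loss, so everything here is an assumption for illustration: the function names, the length-penalty exponent `alpha`, the `margin` value, and the premise that candidates arrive sorted best-first by a metric such as ROUGE.

```python
def length_normalized_score(token_logprobs, alpha=1.0):
    """Score a candidate as its summed token log-probability divided by
    length**alpha (alpha is a hypothetical length-penalty exponent)."""
    return sum(token_logprobs) / (len(token_logprobs) ** alpha)


def ranking_loss(scores, margin=0.001):
    """Pairwise margin ranking loss.

    `scores` holds the model's score for each candidate, with candidates
    assumed to be sorted best-first by an external quality metric.
    A pair (i, j) with i < j contributes a penalty when the worse
    candidate j is not scored at least (j - i) * margin below candidate i.
    """
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


# Model scores already agree with the quality ordering: no penalty.
agreeing = ranking_loss([-0.5, -1.0, -1.5], margin=0.1)

# Model scores are reversed relative to the quality ordering: penalized.
reversed_order = ranking_loss([-1.5, -1.0, -0.5], margin=0.1)
```

In a full training setup, a term like this would typically be combined with the usual cross-entropy loss on the reference summary, so that the reference remains one signal among several rather than the only target.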