Unsupervised Text Summarization via Mixed Model Back-Translation
This work addresses the problem of generating summaries without aligned data, which is relevant to NLP researchers, though it is incremental in that it builds on existing back-translation paradigms.
The authors tackle unsupervised sentence summarization by extending back-translation to unaligned data, improving on the state of the art by over 2 ROUGE points and matching recent semi-supervised methods.
Back-translation-based approaches have recently led to significant progress in unsupervised sequence-to-sequence tasks such as machine translation or style transfer. In this work, we extend the paradigm to the problem of learning a sentence summarization system from unaligned data. We present several initial models which rely on the asymmetrical nature of the task to perform the first back-translation step, and demonstrate the value of combining the data created by these diverse initialization methods. Our system outperforms the current state-of-the-art for unsupervised sentence summarization from fully unaligned data by over 2 ROUGE, and matches the performance of recent semi-supervised approaches.
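As a rough illustration of combining data from diverse initialization methods, the sketch below builds a mixed set of (pseudo-summary, sentence) pairs that could serve as training data for a first back-translation step. The two initializers (`lead_words`, `drop_short_words`) and the helper `build_mixed_pairs` are hypothetical stand-ins chosen for brevity, not the initial models used in this work, and the direction of the pairs (short text as source, full sentence as target) is an assumption.

```python
import random
from typing import Callable, List, Tuple

# A "summarizer" here is any function mapping a sentence to a shorter string.
Summarizer = Callable[[str], str]


def lead_words(sentence: str, keep: int = 8) -> str:
    """Hypothetical initializer: keep only the first `keep` words."""
    return " ".join(sentence.split()[:keep])


def drop_short_words(sentence: str, min_len: int = 4) -> str:
    """Hypothetical initializer: drop words shorter than `min_len` characters."""
    kept = [w for w in sentence.split() if len(w) >= min_len]
    # Fall back to the original sentence if every word was dropped.
    return " ".join(kept) if kept else sentence


def build_mixed_pairs(
    sentences: List[str],
    initializers: List[Summarizer],
    seed: int = 0,
) -> List[Tuple[str, str]]:
    """Pool (pseudo_summary, sentence) pairs from every initializer, then shuffle.

    Assumption: the pooled pairs are used as (source, target) examples for a
    model trained in the reverse direction (short text -> full sentence).
    """
    pairs = [(init(s), s) for init in initializers for s in sentences]
    random.Random(seed).shuffle(pairs)  # seeded for reproducibility
    return pairs


if __name__ == "__main__":
    corpus = [
        "the quick brown fox jumps over the lazy dog near the river bank today",
        "researchers presented several initial models for sentence summarization",
    ]
    for source, target in build_mixed_pairs(corpus, [lead_words, drop_short_words]):
        print(f"{source!r} -> {target!r}")
```

In a full pipeline, the mixed pairs would presumably train a model in the reverse direction, whose outputs then supply data for the next round; that loop is omitted here.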