LenAtten: An Effective Length Controlling Unit For Text Summarization
LenAtten addresses the trade-off between length control and summary quality, targeting applications that require precise summary lengths.
The paper tackles fixed-length text summarization by introducing LenAtten, a length-controlling unit that improves both length controllability and summary quality. On the CNN/Daily Mail dataset, it is reported to be 732 times better in length controllability than the best existing length-controllable summarizer.
Fixed-length summarization aims to generate summaries with a preset number of words or characters. Most recent work incorporates length information with word embeddings as the input to the recurrent decoding unit, which forces a compromise between length controllability and summary quality. In this work, we present an effective length-controlling unit, Length Attention (LenAtten), to break this trade-off. Experimental results show that LenAtten not only improves length controllability and ROUGE scores but also generalizes well. In the task of generating a summary with a target length, our model is 732 times better in length controllability than the best-performing length-controllable summarizer on the CNN/Daily Mail dataset.
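
To make the notion of length controllability concrete, the following minimal sketch scores a set of generated summaries by how far their word counts fall from a target length. The function name `length_deviation`, the mean-squared-difference formula, and the example summaries are assumptions chosen for illustration; the exact metric behind the reported 732-fold figure is not given in this text and may differ.

```python
def length_deviation(summaries, target_len):
    """Mean squared difference between each summary's word count and the target.

    Illustrative measure only (an assumption, not necessarily the paper's metric).
    Lower is better; 0 means every summary hits the target length exactly.
    """
    if not summaries:
        raise ValueError("need at least one summary")
    # Word count via whitespace splitting; a real evaluation may tokenize differently.
    lengths = [len(s.split()) for s in summaries]
    return sum((n - target_len) ** 2 for n in lengths) / len(lengths)


# Hypothetical outputs from two systems asked for 5-word summaries.
system_a = ["the cat sat on mats", "stocks fell sharply on monday"]
system_b = ["the cat sat", "stocks fell sharply on monday after weak earnings"]

dev_a = length_deviation(system_a, target_len=5)
dev_b = length_deviation(system_b, target_len=5)
print(dev_a, dev_b)
```

Under this measure, a system whose summaries always match the target scores 0, and larger values indicate looser length control. Comparing two systems by such a score is one plausible way to quantify a gap in length controllability.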