A Decoding Algorithm for Length-Control Summarization Based on Directed Acyclic Transformers
This work addresses the problem of generating summaries under strict length limits for applications such as news summarization, and it improves incrementally on existing methods.
The paper tackles length-control summarization, where previous methods often fail to strictly satisfy the length constraint, and proposes a decoding algorithm based on Directed Acyclic Transformers that achieves state-of-the-art performance on the Gigaword and DUC2004 datasets.
Length-control summarization aims to condense a long text into a short one that fits within a given length limit. Previous approaches often use autoregressive (AR) models and treat the length requirement as a soft constraint, which may not always be satisfied. In this study, we propose a novel length-control decoding algorithm based on the Directed Acyclic Transformer (DAT). Our approach allows for multiple plausible sequence fragments and predicts a \emph{path} to connect them. In addition, we propose a Sequence Maximum a Posteriori (SeqMAP) decoding algorithm that marginalizes over the different possible paths and finds the most probable summary satisfying the length budget. Because our algorithm is based on beam search, its candidate summaries can further be passed to a reranker for improved performance. Experimental results on the Gigaword and DUC2004 datasets show that our approach achieves state-of-the-art performance for length-control summarization.
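
To make the idea of choosing a path under a length budget concrete, the following is a minimal sketch, not the SeqMAP algorithm itself. It assumes a toy directed acyclic graph in which each vertex carries one fixed token, and it finds the single most probable path from the first to the last vertex that uses at most `budget` vertices, by dynamic programming. SeqMAP as described above goes further: it marginalizes over paths that yield the same summary and relies on beam search. The function name `best_path_within_budget`, the graph layout, and all probabilities are hypothetical and chosen only for illustration.

```python
import math

NEG_INF = float("-inf")

def best_path_within_budget(trans, tokens, budget):
    """Most probable path from vertex 0 to the last vertex with at most `budget` vertices.

    trans[i] maps a successor vertex j (j > i) to the log-probability of the step i -> j.
    tokens[i] is a (token, log_prob) pair: the single token assumed at vertex i.
    Returns (log_score, token_list), or None if no path fits the budget.
    """
    n = len(tokens)
    # best[l][j]: best log-probability of a path that uses exactly l vertices and ends at j
    best = [[NEG_INF] * n for _ in range(budget + 1)]
    back = [[None] * n for _ in range(budget + 1)]
    best[1][0] = tokens[0][1]
    for l in range(2, budget + 1):
        for j in range(1, n):
            for i in range(j):
                step = trans[i].get(j)
                if step is None or best[l - 1][i] == NEG_INF:
                    continue
                cand = best[l - 1][i] + step + tokens[j][1]
                if cand > best[l][j]:
                    best[l][j] = cand
                    back[l][j] = i
    # Choose the best length among all lengths allowed by the budget.
    l_best = max(range(1, budget + 1), key=lambda l: best[l][n - 1])
    if best[l_best][n - 1] == NEG_INF:
        return None
    # Follow the back-pointers from the last vertex to vertex 0.
    path, j, l = [], n - 1, l_best
    while j is not None:
        path.append(j)
        j, l = back[l][j], l - 1
    path.reverse()
    return best[l_best][n - 1], [tokens[v][0] for v in path]

# Toy graph with made-up probabilities (illustrative only).
tokens = [("police", math.log(0.9)), ("quickly", math.log(0.8)),
          ("arrest", math.log(0.5)), ("suspects", math.log(0.9))]
trans = [
    {1: math.log(0.6), 2: math.log(0.3), 3: math.log(0.1)},
    {2: math.log(0.7), 3: math.log(0.3)},
    {3: math.log(1.0)},
    {},
]

for budget in (4, 3, 2):
    score, words = best_path_within_budget(trans, tokens, budget)
    print(budget, " ".join(words), round(math.exp(score), 5))
```

In this toy graph, tightening the budget from 4 to 3 makes the search skip a vertex instead of truncating the output, which is the behaviour a hard length constraint is meant to give. The sketch scores individual paths only, so two paths that produce the same word sequence are not merged, whereas the sequence-level objective described above would sum their probabilities.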