Content based Weighted Consensus Summarization
This work addresses the challenge of selecting robust summarization systems for researchers and practitioners, though it is incremental as it builds on existing consensus-based ensemble approaches.
The paper addresses the difficulty of choosing among multi-document summarization systems that perform comparably. It proposes an ensemble method that fixes two shortcomings of existing consensus-based techniques, namely that they ignore the relative performance of the candidate systems and the content of their summaries, and reports that the method outperforms existing consensus-based techniques by a large margin on the DUC 2003 and DUC 2004 datasets.
Multi-document summarization has received a great deal of attention in the past couple of decades. Several approaches have been proposed, many of which perform equally well, and it is becoming increasingly difficult to choose one particular system over another. An ensemble of such systems that is able to leverage the strengths of each individual system can build a better and more robust summary. Despite this, few attempts have been made in this direction. In this paper, we describe a category of ensemble systems which use consensus between the candidate systems to build a better meta-summary. We highlight two major shortcomings of such systems: the inability to take into account the relative performance of individual systems, and the overlooking of the content of candidate summaries in favour of the sentence rankings. We propose an alternative method, content-based weighted consensus summarization, which addresses these concerns. We use pseudo-relevant summaries to estimate the performance of individual candidate systems, and then use this information to generate a better aggregate ranking. Experiments on the DUC 2003 and DUC 2004 datasets show that the proposed system outperforms existing consensus-based techniques by a large margin.
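
To make the idea of a content-aware, performance-weighted consensus concrete, the following is a minimal sketch and not the method evaluated in the paper. It rests on several assumptions that the abstract does not state: each system's summary is scored against the other systems' summaries, which stand in for pseudo-relevant references; the score is a crude unigram-recall overlap rather than a standard metric; and the aggregate ranking is a weighted Borda count. The names `unigram_overlap`, `system_weights`, and `weighted_borda` are hypothetical.

```python
from collections import Counter


def unigram_overlap(candidate, reference):
    """Fraction of reference tokens also present in the candidate.

    Assumption: a crude stand-in for a recall-oriented content metric.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    matched = sum(min(count, cand[word]) for word, count in ref.items())
    return matched / sum(ref.values())


def system_weights(summaries):
    """Estimate a weight per system from summary content.

    Assumption: the other systems' summaries act as pseudo-relevant
    references. Weights are normalised to sum to 1.
    """
    raw = {}
    for name, text in summaries.items():
        others = [t for n, t in summaries.items() if n != name]
        if others:
            raw[name] = sum(unigram_overlap(text, o) for o in others) / len(others)
        else:
            raw[name] = 0.0
    total = sum(raw.values())
    if total == 0:
        # No overlap information at all: fall back to uniform weights.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: value / total for name, value in raw.items()}


def weighted_borda(rankings, weights):
    """Aggregate per-system sentence rankings (best first) into one ranking.

    Each system awards n points to its top sentence, n - 1 to the next,
    and so on; the points are scaled by that system's weight.
    """
    scores = {}
    for name, ranking in rankings.items():
        n = len(ranking)
        for position, sentence_id in enumerate(ranking):
            points = weights[name] * (n - position)
            scores[sentence_id] = scores.get(sentence_id, 0.0) + points
    # Sort by descending score; break ties by sentence id for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Toy example (invented data): system C disagrees with A and B on content.
summaries = {
    "A": "the flood destroyed homes in the city",
    "B": "the flood destroyed many homes",
    "C": "stock markets rallied today",
}
rankings = {
    "A": ["s1", "s2", "s3"],
    "B": ["s2", "s1", "s3"],
    "C": ["s3", "s2", "s1"],
}
weights = system_weights(summaries)
print(weights)
print(weighted_borda(rankings, weights))
```

In this toy setting, a system whose summary shares no content with the others receives zero weight, so its ranking no longer influences the aggregate. An unweighted consensus over the same rankings would count all three systems equally, which illustrates the first shortcoming the abstract describes.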