Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation
This work addresses the need of researchers and practitioners for better summarization evaluation metrics. Its contribution is incremental, since it builds on existing reference-based evaluation methods.
The authors develop interpretable and efficient automatic metrics for reference-based summarization evaluation. The metrics build on a two-stage pipeline that first extracts basic information units from one text sequence and then checks those units against another sequence. The two-stage metrics offer high interpretability at both the fine-grained unit level and the summary level, while the one-stage metrics balance efficiency and interpretability. The tools are publicly available.
Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics. In this work, we develop strong-performing automatic metrics for reference-based summarization evaluation, based on a two-stage evaluation pipeline that first extracts basic information units from one text sequence and then checks the extracted units in another sequence. The metrics we developed include two-stage metrics that can provide high interpretability at both the fine-grained unit level and summary level, and one-stage metrics that achieve a balance between efficiency and interpretability. We make the developed tools publicly available at https://github.com/Yale-LILY/AutoACU.
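To make the two-stage idea concrete, the following is a minimal sketch of an extract-then-check pipeline. It is not the AutoACU API: the function names (`extract_units`, `unit_is_supported`, `two_stage_score`), the sentence-level units, the token-overlap check, and the 0.6 threshold are all hypothetical stand-ins for the learned extraction and checking models the abstract refers to. The sketch only shows how per-unit judgments can give unit-level interpretability while their average gives a summary-level score.

```python
import re
from typing import List, Tuple


def extract_units(text: str) -> List[str]:
    """Stage 1 (toy stand-in): treat each sentence as one information unit."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def unit_is_supported(unit: str, text: str, threshold: float = 0.6) -> bool:
    """Stage 2 (toy stand-in): token overlap instead of a learned checker.

    The threshold is an arbitrary illustrative value.
    """
    unit_tokens = set(re.findall(r"\w+", unit.lower()))
    text_tokens = set(re.findall(r"\w+", text.lower()))
    if not unit_tokens:
        return False
    return len(unit_tokens & text_tokens) / len(unit_tokens) >= threshold


def two_stage_score(
    reference: str, candidate: str
) -> Tuple[float, List[Tuple[str, bool]]]:
    """Extract units from the reference, then check each one in the candidate.

    Returns the fraction of supported units (summary-level score) together
    with the per-unit judgments (unit-level explanation).
    """
    units = extract_units(reference)
    judgments = [(u, unit_is_supported(u, candidate)) for u in units]
    if not judgments:
        return 0.0, judgments
    score = sum(ok for _, ok in judgments) / len(judgments)
    return score, judgments
```

Calling `two_stage_score(reference, candidate)` returns both the aggregate score and the list of `(unit, supported)` pairs, so a reader can see which reference units the candidate summary was judged to cover.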