Fill in the BLANC: Human-free quality estimation of document summaries
BLANC provides a fully automated, human-free method for evaluating summary quality. The contribution is incremental in that it builds on existing language models and benchmarks.
The paper tackles automatic estimation of document summary quality without human-written reference summaries. It introduces BLANC, which measures the performance boost a pre-trained language model gains from access to the summary, and reports correlation with human evaluations comparable to that of ROUGE.
We present BLANC, a new approach to the automatic estimation of document summary quality. Our goal is to measure the functional performance of a summary with an objective, reproducible, and fully automated method. Our approach achieves this by measuring the performance boost gained by a pre-trained language model with access to a document summary while carrying out its language understanding task on the document's text. We present evidence that BLANC scores correlate with human evaluations as well as the ROUGE family of summary quality measurements does. Unlike ROUGE, the BLANC method does not require human-written reference summaries, allowing for fully human-free summary quality estimation.
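The idea of scoring a summary by the boost it gives a model can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the authors' implementation: the language understanding task is taken to be filling in masked tokens, the baseline context is a filler string of the same length as the summary, and `toy_predictor` is a hypothetical stand-in for a real pre-trained language model. The names `unmask_accuracy` and `blanc_boost` are likewise invented for this example.

```python
from typing import Callable, List

MASK = "[MASK]"
FILLER = "."  # assumed neutral token used to build the no-summary baseline

# Hypothetical predictor interface: (context, masked_sentence, mask_index) -> guess
Predictor = Callable[[List[str], List[str], int], str]


def toy_predictor(context: List[str], masked: List[str], idx: int) -> str:
    """Stand-in for a pre-trained language model (assumption).

    Guesses the first context token that is not already visible in the
    masked sentence. A real model would use learned knowledge as well.
    """
    visible = set(masked)
    for tok in context:
        if tok != FILLER and tok not in visible:
            return tok
    return ""


def unmask_accuracy(sentences: List[List[str]], context: List[str],
                    predict: Predictor, every: int = 2) -> float:
    """Fraction of masked tokens recovered when the model sees `context`."""
    correct = 0
    total = 0
    for sent in sentences:
        # Mask one token at a time, at every `every`-th position.
        for i in range(0, len(sent), every):
            masked = sent[:i] + [MASK] + sent[i + 1:]
            correct += predict(context, masked, i) == sent[i]
            total += 1
    return correct / total if total else 0.0


def blanc_boost(sentences: List[List[str]], summary: List[str],
                predict: Predictor) -> float:
    """Accuracy with the summary minus accuracy with a same-length filler."""
    filler = [FILLER] * len(summary)
    with_summary = unmask_accuracy(sentences, summary, predict)
    without_summary = unmask_accuracy(sentences, filler, predict)
    return with_summary - without_summary


document = [["cats", "chase", "mice"]]
good_summary = ["cats", "mice"]
bad_summary = ["dogs", "bark"]

print(blanc_boost(document, good_summary, toy_predictor))
print(blanc_boost(document, bad_summary, toy_predictor))
```

In this toy setting a summary that shares content with the document helps the predictor recover masked tokens, while an unrelated summary gives no boost. No reference summary is consulted at any point, which is the property the abstract contrasts with ROUGE.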