CL AI LG

Sep 20, 2019

Towards Neural Language Evaluators

Hassan Kané, Yusuf Kocyigit, Pelkins Ajanoh, Ali Abdalla, Mohamed Coulibali

arXiv:1909.09268v2

0.52 citations

Originality: Synthesis-oriented

AI Analysis

This work tackles the problem of improving evaluation metrics for text summarization. The contribution is incremental, as it builds on existing methods.

The paper addresses limitations of BLEU and ROUGE for evaluating summaries by proposing criteria for good metrics and by using Transformer-based language models to compare hypothesis summaries against reference summaries. It does not report specific numerical results.

We review three limitations of BLEU and ROUGE (the most popular metrics used to assess reference summaries against hypothesis summaries), propose criteria for how a good metric should behave, and suggest concrete ways to use recent Transformer-based language models to assess reference summaries against hypothesis summaries.
