CL LG | Oct 13, 2020

Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance

Xi Chen, Nan Ding, Tomer Levinboim, Radu Soricut

arXiv:2010.06150v1 31.0995 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for more reliable automatic evaluation metrics in natural language processing, though it is incremental as it builds on existing BERT-based methods.

The paper tackles the sub-optimal statistical properties of contextualized word representations used for text generation evaluation. It introduces batch-mean centering and a tempered Word Mover Distance, and reports state-of-the-art correlation with human ratings on several benchmarks.

Recent advances in automatic evaluation metrics for text have shown that deep contextualized word representations, such as those generated by BERT encoders, are helpful for designing metrics that correlate well with human judgements. At the same time, it has been argued that contextualized word representations exhibit sub-optimal statistical properties for encoding the true similarity between words or sentences. In this paper, we present two techniques for improving encoding representations for similarity metrics: a batch-mean centering strategy that improves statistical properties, and a computationally efficient tempered Word Mover Distance for better fusion of the information in the contextualized word representations. We conduct numerical experiments that demonstrate the robustness of our techniques, reporting results over various BERT-backbone learned metrics and achieving state-of-the-art correlation with human ratings on several benchmarks.
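To make the batch-mean centering idea concrete, here is a minimal sketch in plain Python. It assumes that "batch-mean centering" means subtracting the mean token vector, computed over every token in a batch, from each token vector. That is one plausible reading of the abstract, not a reproduction of the authors' implementation. The function names `batch_mean_center` and `cosine` and the toy 2-dimensional "embeddings" are hypothetical, and the tempered Word Mover Distance is not sketched because the abstract does not define it.

```python
import math


def batch_mean_center(batch):
    """Subtract the batch-wide mean token vector from every token vector.

    `batch` is a list of sentences; each sentence is a list of token
    vectors (lists of floats of equal length). Hypothetical helper.
    """
    tokens = [vec for sent in batch for vec in sent]
    dim = len(tokens[0])
    mean = [sum(vec[d] for vec in tokens) / len(tokens) for d in range(dim)]
    return [[[x - m for x, m in zip(vec, mean)] for vec in sent] for sent in batch]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Toy vectors that share a large common offset along the first axis,
# so raw cosine similarity is high even for opposite second components.
batch = [
    [[10.0, 1.0], [10.0, -1.0]],
    [[11.0, 0.5], [9.0, -0.5]],
]
centered = batch_mean_center(batch)

raw_sim = cosine(batch[0][0], batch[0][1])
centered_sim = cosine(centered[0][0], centered[0][1])
print(f"raw: {raw_sim:.3f}, centered: {centered_sim:.3f}")
```

In this toy case the shared offset dominates the raw vectors, so two tokens that differ only in the sign of their second component still look nearly identical under cosine similarity. Removing the batch mean exposes that difference.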
