CL · Mar 30, 2021

Evaluating the Morphosyntactic Well-formedness of Generated Texts

arXiv:2103.16590v2 · 665 citations
Originality: Incremental advance
AI Analysis

This paper addresses text quality evaluation for NLP researchers and practitioners, particularly in multilingual applications. It is an incremental advance because it builds on existing parsing and rule-based methods.

The paper tackles the challenge of evaluating text generation systems in multilingual settings by proposing L'AMBRE, a metric that assesses morphosyntactic well-formedness using dependency parses and language rules, and demonstrates its effectiveness through a diachronic study of machine translation into morphologically-rich languages.

Text generation systems are ubiquitous in natural language processing applications. However, evaluation of these systems remains a challenge, especially in multilingual settings. In this paper, we propose L'AMBRE -- a metric to evaluate the morphosyntactic well-formedness of text using its dependency parse and morphosyntactic rules of the language. We present a way to automatically extract various rules governing morphosyntax directly from dependency treebanks. To tackle the noisy outputs from text generation systems, we propose a simple methodology to train robust parsers. We show the effectiveness of our metric on the task of machine translation through a diachronic study of systems translating into morphologically-rich languages.
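
The abstract describes two steps: extracting morphosyntactic rules from a dependency treebank, then scoring a parsed text against those rules. The sketch below illustrates that idea for agreement rules only, and it is not the authors' implementation. The `Token` class, both function names, the thresholds, and the toy data are hypothetical. It assumes a rule is a (dependency relation, feature) pair on which head and dependent nearly always agree in the treebank, and that the score is the fraction of applicable rules a sentence satisfies.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Token:
    idx: int      # 1-based position in the sentence
    head: int     # position of the syntactic head (0 = root)
    deprel: str   # dependency relation to the head
    feats: dict = field(default_factory=dict)  # morphological features


def extract_agreement_rules(treebank, min_count=2, threshold=0.9):
    """Keep (deprel, feature) pairs whose head and dependent agree
    in at least `threshold` of the observed cases (assumed criterion)."""
    agree, total = Counter(), Counter()
    for sentence in treebank:
        by_idx = {t.idx: t for t in sentence}
        for tok in sentence:
            head = by_idx.get(tok.head)
            if head is None:  # root token has no head to agree with
                continue
            for feat in tok.feats.keys() & head.feats.keys():
                key = (tok.deprel, feat)
                total[key] += 1
                agree[key] += tok.feats[feat] == head.feats[feat]
    return {k for k, n in total.items()
            if n >= min_count and agree[k] / n >= threshold}


def well_formedness(sentence, rules):
    """Fraction of applicable agreement rules the sentence satisfies,
    or None when no rule applies."""
    by_idx = {t.idx: t for t in sentence}
    checked = satisfied = 0
    for tok in sentence:
        head = by_idx.get(tok.head)
        if head is None:
            continue
        for feat in tok.feats.keys() & head.feats.keys():
            if (tok.deprel, feat) in rules:
                checked += 1
                satisfied += tok.feats[feat] == head.feats[feat]
    return satisfied / checked if checked else None


def noun_phrase(det, noun, adj):
    """Toy determiner-noun-adjective phrase headed by the noun."""
    return [Token(1, 2, "det", det), Token(2, 0, "root", noun),
            Token(3, 2, "amod", adj)]


fem_sg = {"Gender": "Fem", "Number": "Sing"}
masc_pl = {"Gender": "Masc", "Number": "Plur"}
masc_sg = {"Gender": "Masc", "Number": "Sing"}

# Toy "treebank" in which determiners and adjectives always agree with the noun.
treebank = [noun_phrase(fem_sg, fem_sg, fem_sg),
            noun_phrase(masc_pl, masc_pl, masc_pl)]
rules = extract_agreement_rules(treebank)

good = noun_phrase(fem_sg, fem_sg, fem_sg)
bad = noun_phrase(masc_sg, fem_sg, fem_sg)  # determiner has the wrong gender

print(well_formedness(good, rules))  # 1.0
print(well_formedness(bad, rules))   # 0.75
```

In the second phrase, one of the four applicable checks fails (determiner gender), so the score is 0.75. In practice the parse would come from a parser run on system output, which is why the abstract stresses training parsers that are robust to noisy text.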

Code Implementations: 1 repo
