CL, AI · Oct 17, 2025

Readability Reconsidered: A Cross-Dataset Analysis of Reference-Free Metrics

Catarina G Belem, Parker Glenn, Alfy Samuel, Anoop Kumar, Daben Liu

arXiv:2510.15345v14.91 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses the inconsistent definitions and measurements of readability that researchers and practitioners in natural language processing face, offering an incremental advance by pointing to model-based approaches.

The study analyzed human readability judgments and found that, beyond surface-level cues, information content and topic strongly shape comprehensibility. It also showed that model-based metrics outperform traditional ones: the best traditional metric averaged a rank of 8.6, whereas four model-based metrics consistently ranked in the top four.

Automatic readability assessment plays a key role in ensuring effective and accessible written communication. Despite significant progress, the field is hindered by inconsistent definitions of readability and measurements that rely on surface-level text properties. In this work, we investigate the factors shaping human perceptions of readability through the analysis of 897 judgments, finding that, beyond surface-level cues, information content and topic strongly shape text comprehensibility. Furthermore, we evaluate 15 popular readability metrics across five English datasets, contrasting them with six more nuanced, model-based metrics. Our results show that four model-based metrics consistently place among the top four in rank correlations with human judgments, while the best performing traditional metric achieves an average rank of 8.6. These findings highlight a mismatch between current readability metrics and human perceptions, pointing to model-based approaches as a more promising direction.
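
The abstract compares metrics by their rank correlation with human judgments. As a rough illustration of that kind of comparison, the sketch below computes Spearman's rank correlation between each metric's scores and human ratings on a single dataset, then orders the metrics by that correlation. The choice of Spearman's coefficient, the metric names (`surface_metric`, `model_metric`), and all numbers are assumptions made up for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical human readability ratings for five texts (illustrative only).
human = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical metric scores for the same five texts (illustrative only).
metric_scores = {
    "surface_metric": [2.0, 1.0, 4.0, 3.0, 5.0],
    "model_metric": [0.1, 0.2, 0.35, 0.5, 0.9],
}

correlations = {name: spearman(s, human) for name, s in metric_scores.items()}

# Metrics ordered from highest to lowest correlation with human judgments.
ranking = sorted(correlations, key=correlations.get, reverse=True)
print(correlations)
print(ranking)
```

Repeating this per dataset and averaging each metric's position in `ranking` would give an average rank comparable in spirit to the figures quoted in the abstract, although the exact procedure used in the paper may differ.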
