CL · AI · Oct 17, 2025

Readability Reconsidered: A Cross-Dataset Analysis of Reference-Free Metrics

arXiv:2510.15345v1 · 1 citation · h-index: 2
AI Analysis

This work addresses inconsistent definitions and metrics of readability, a problem for researchers and practitioners in natural language processing. Its contribution is incremental: it shows that model-based approaches align more closely with human judgments than traditional metrics do.

The study analyzed 897 human readability judgments and found that, beyond surface-level cues, information content and topic strongly shape comprehensibility. Across five English datasets, four of the six model-based metrics consistently placed in the top four by rank correlation with human judgments, whereas the best traditional metric averaged a rank of 8.6.

Automatic readability assessment plays a key role in ensuring effective and accessible written communication. Despite significant progress, the field is hindered by inconsistent definitions of readability and measurements that rely on surface-level text properties. In this work, we investigate the factors shaping human perceptions of readability through the analysis of 897 judgments, finding that, beyond surface-level cues, information content and topic strongly shape text comprehensibility. Furthermore, we evaluate 15 popular readability metrics across five English datasets, contrasting them with six more nuanced, model-based metrics. Our results show that four model-based metrics consistently place among the top four in rank correlations with human judgments, while the best performing traditional metric achieves an average rank of 8.6. These findings highlight a mismatch between current readability metrics and human perceptions, pointing to model-based approaches as a more promising direction.
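The comparison described in the abstract (rank correlation with human judgments per dataset, then an average rank per metric) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the choice of Spearman correlation, the use of the absolute value to ignore a metric's direction, and all names and toy numbers (`surface_metric`, `model_metric`, datasets `A` and `B`) are assumptions made for the example.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_rank_per_metric(scores, human):
    """Average each metric's per-dataset rank (1 = best agreement with humans).

    scores: {dataset: {metric: [score per text]}}
    human:  {dataset: [human rating per text]}
    """
    per_metric = {}
    for dataset, metrics in scores.items():
        # Assumption: only the strength of the correlation matters, since
        # some metrics score difficulty and others score ease.
        rho = {m: abs(spearman(v, human[dataset])) for m, v in metrics.items()}
        # Negate so that the strongest correlation receives rank 1.
        dataset_ranks = average_ranks([-rho[m] for m in rho])
        for m, r in zip(rho, dataset_ranks):
            per_metric.setdefault(m, []).append(r)
    return {m: mean(r) for m, r in per_metric.items()}


# Hypothetical toy data: two datasets, five texts each, two metrics.
toy_human = {"A": [1, 2, 3, 4, 5], "B": [2, 1, 4, 3, 5]}
toy_scores = {
    "A": {"surface_metric": [3, 1, 2, 5, 4],
          "model_metric": [0.1, 0.2, 0.3, 0.4, 0.5]},
    "B": {"surface_metric": [5, 4, 3, 2, 1],
          "model_metric": [0.2, 0.1, 0.5, 0.3, 0.9]},
}
toy_result = mean_rank_per_metric(toy_scores, toy_human)
print(toy_result)
```

On this toy data the hypothetical `model_metric` ranks first on both datasets, so its average rank is 1.0 and `surface_metric` averages 2.0. Averaging ranks rather than raw correlations keeps datasets with different rating scales comparable, which is presumably why the abstract reports an average rank such as 8.6.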
