CL AI | Oct 12, 2025

Toward Human-Centered Readability Evaluation

arXiv:2510.10801v1 | 1 citation | h-index: 6 | Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)

Originality: Incremental advance

AI Analysis

This work addresses the need for more usable and respectful health communication for diverse populations, including those with limited health literacy. It is an incremental advance, building on existing HCI and health communication research.

The paper argues that existing NLP metrics for text simplification, such as BLEU and SARI, fail to capture human-centered qualities like clarity and trustworthiness, especially in health contexts. It proposes the Human-Centered Readability Score (HCRS), a framework that integrates automatic measures with human feedback to evaluate simplified text more fully.

Text simplification is essential for making public health information accessible to diverse populations, including those with limited health literacy. However, commonly used evaluation metrics in Natural Language Processing (NLP), such as BLEU, FKGL, and SARI, mainly capture surface-level features and fail to account for human-centered qualities like clarity, trustworthiness, tone, cultural relevance, and actionability. This limitation is particularly critical in high-stakes health contexts, where communication must be not only simple but also usable, respectful, and trustworthy. To address this gap, we propose the Human-Centered Readability Score (HCRS), a five-dimensional evaluation framework grounded in Human-Computer Interaction (HCI) and health communication research. HCRS integrates automatic measures with structured human feedback to capture the relational and contextual aspects of readability. We outline the framework, discuss its integration into participatory evaluation workflows, and present a protocol for empirical validation. This work aims to advance the evaluation of health text simplification beyond surface metrics, enabling NLP systems that align more closely with diverse users' needs, expectations, and lived experiences.
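
To illustrate the claim that FKGL captures only surface-level features, here is a minimal Python sketch of the standard Flesch-Kincaid Grade Level formula. The syllable counter is a crude vowel-group heuristic assumed for this example, and the function names and sample sentences are hypothetical; none of this comes from the paper.

```python
import re


def count_syllables(word: str) -> int:
    # Assumption: approximate syllables as runs of consecutive vowels.
    # Real implementations usually rely on a pronunciation dictionary.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fkgl(text: str) -> float:
    # Flesch-Kincaid Grade Level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Two instructions with the same word, sentence, and (heuristic) syllable counts.
clear = "Please take one pill each day."
vague = "Maybe take one pill each day."

same_score = fkgl(clear) == fkgl(vague)
print(same_score)  # True
```

Both sentences receive the same score because the formula sees only counts of words, sentences, and syllables. The hedged "Maybe" version is arguably less actionable and less trustworthy, which is the kind of difference the abstract says surface metrics cannot register.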
