Free-text Rationale Generation under Readability Level Control
This work addresses the challenge of making AI explanations accessible by controlling their readability, though the contribution is incremental, as it builds on existing rationale generation methods.
The study investigated how large language models generate free-text rationales when prompted to target specific readability levels. It found that explanations adapt to such instructions, but that the distinction between levels does not fully align with traditional readability metrics, and that human annotators most favored high-school-level readability.
Free-text rationales justify model decisions in natural language, which makes them an appealing and accessible approach to explanation across many tasks. However, their effectiveness can be hindered by misinterpretation and hallucination. As a perturbation test, we investigate how large language models (LLMs) perform rationale generation under readability level control, i.e., when prompted for an explanation targeting a specific expertise level, such as sixth grade or college. We find that explanations adapt to such instructions, though the observed distinction between readability levels does not fully match the complexity scores defined by traditional readability metrics. Furthermore, the generated rationales tend to feature medium-level complexity, which correlates with quality as measured by automatic metrics. Finally, our human annotators report a generally satisfactory impression of rationales at all readability levels, with high-school-level readability being most commonly perceived and favored.
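To make "traditional readability metrics" concrete, the following is a minimal sketch of one widely used metric, the Flesch-Kincaid Grade Level, which maps a text to an approximate U.S. school grade. The abstract does not name the metrics used in the study, so the choice of this metric is an assumption made for illustration. The function names and the vowel-group syllable counter are likewise hypothetical and only a rough heuristic; a dedicated library would count syllables more accurately.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (illustrative heuristic, not a dictionary lookup).

    Counts runs of consecutive vowels and discounts a silent trailing 'e'.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final 'e' as silent ("make"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:

    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Hypothetical rationales written for two different target audiences.
simple = "The cat sat on the mat. The dog ran to the park."
dense = (
    "Comprehensive explanations necessitate sophisticated terminology, "
    "considerable background knowledge, and extraordinarily convoluted "
    "sentence structures."
)

print(round(flesch_kincaid_grade(simple), 2))
print(round(flesch_kincaid_grade(dense), 2))
```

Scoring a rationale generated for each prompted level in this way would allow the requested level to be compared against the measured one, which is the kind of mismatch the abstract reports. Because the score depends only on sentence length and syllable counts, it cannot capture conceptual difficulty, which is one plausible reason such metrics and prompted levels may diverge.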