CL · AI · Jun 3, 2025

EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing

arXiv:2506.02596v1 · 3 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This paper addresses the problem of evaluating LLMs on Chinese essay writing in educational contexts. The contribution is incremental, as it builds on existing benchmarking efforts.

The authors address the lack of robust evaluation for Large Language Models in Chinese essay writing by creating EssayBench, a multi-genre benchmark with 728 prompts and a fine-grained scoring framework. They then benchmark 15 LLMs and analyze their performance across genres.

Chinese essay writing and its evaluation are critical in educational contexts, yet the capabilities of Large Language Models (LLMs) in this domain remain largely underexplored. Existing benchmarks often rely on coarse-grained text quality metrics, largely overlooking the structural and rhetorical complexities of Chinese essays, particularly across diverse genres. To address this gap, we propose EssayBench, a multi-genre benchmark specifically designed for Chinese essay writing across four major genres: Argumentative, Narrative, Descriptive, and Expository. We curate and refine a total of 728 real-world prompts to ensure authenticity and meticulously categorize them into the Open-Ended and Constrained sets to capture diverse writing scenarios. To reliably evaluate generated essays, we develop a fine-grained, genre-specific scoring framework that hierarchically aggregates scores. We further validate our evaluation protocol through a comprehensive human agreement study. Finally, we benchmark 15 large-sized LLMs, analyzing their strengths and limitations across genres and instruction types. With EssayBench, we aim to advance LLM-based Chinese essay evaluation and inspire future research on improving essay generation in educational settings.
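
The abstract does not spell out how scores are "hierarchically aggregated", so the following is only a minimal sketch of one plausible reading: fine-grained trait scores are averaged into dimension scores, which are then combined by a weighted average into an overall essay score. The rubric, trait names, dimension names, and weights below are all hypothetical and are not taken from the paper. A genre-specific framework could keep one such rubric per genre.

```python
from statistics import mean

# Hypothetical rubric for one genre. Dimension names, trait names, and
# weights are illustrative assumptions, not EssayBench's actual rubric.
RUBRIC = {
    "content": {"weight": 0.5, "traits": ["thesis_clarity", "evidence"]},
    "structure": {"weight": 0.3, "traits": ["organization", "coherence"]},
    "language": {"weight": 0.2, "traits": ["fluency", "rhetoric"]},
}


def aggregate(trait_scores, rubric=RUBRIC):
    """Aggregate trait scores bottom-up: traits -> dimensions -> overall."""
    # Level 1: each dimension score is the mean of its trait scores.
    dimension_scores = {
        dim: mean(trait_scores[trait] for trait in spec["traits"])
        for dim, spec in rubric.items()
    }
    # Level 2: the overall score is the weight-normalized sum of dimensions.
    total_weight = sum(spec["weight"] for spec in rubric.values())
    overall = sum(
        rubric[dim]["weight"] * score for dim, score in dimension_scores.items()
    ) / total_weight
    return dimension_scores, overall


example_scores = {
    "thesis_clarity": 8, "evidence": 6,
    "organization": 9, "coherence": 7,
    "fluency": 10, "rhetoric": 6,
}
dimension_scores, overall = aggregate(example_scores)
print(dimension_scores, round(overall, 2))
```

Keeping the per-dimension scores alongside the overall score makes it possible to compare models on individual aspects of writing, which is the kind of per-genre analysis the abstract describes.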

