CL CY LG Jul 3, 2024

RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring

arXiv:2407.13781v1 1.0 h-index: 1

Originality: Incremental advance

AI Analysis

RDBE provides interpretable, high-performing essay scoring for educational applications, though the advance is incremental because it builds on existing distillation and scoring methods.

The paper tackles the lack of interpretability in automatic essay scoring by introducing RDBE, which distills reasoning generated by a large language model into a small model. The authors report state-of-the-art performance across all scoring rubrics in the dataset.

Recently, various encoder-only and encoder-decoder pre-trained models such as BERT and T5 have been applied to automatic essay scoring (AES) as small language models. However, existing studies have primarily treated this task as a classification problem, focusing solely on outputting scores in the target text without offering interpretations for the generated scores. Departing from these approaches, we introduce Reasoning Distillation-Based Evaluation (RDBE), which integrates interpretability to elucidate the rationale behind model scores while enhancing performance through initial reasoning. This interpretive capability is acquired during training by using reasoning generated by a large language model (LLM) to distill a small language model (SLM). Our experimental results demonstrate the efficacy of RDBE across all scoring rubrics considered in the dataset. RDBE outperforms both zero-shot LLM generation and generation from a baseline fine-tuned model, establishing itself as state-of-the-art on the corresponding dataset. This highlights its practical interpretive output and enhanced performance.
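
The abstract does not give the training format, so the following is only a minimal sketch of how one distillation example might be assembled, assuming the small model is trained to generate the LLM's reasoning first and the score second. The names `DistillationExample`, `build_example`, and `parse_score`, and the `Rubric:` / `Essay:` / `Reasoning:` / `Score:` field labels, are hypothetical and not taken from the paper.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class DistillationExample:
    source: str  # text fed to the small language model
    target: str  # text the small language model is trained to generate


def build_example(essay: str, rubric: str, llm_reasoning: str, score: int) -> DistillationExample:
    """Pair an essay with a target that states the reasoning before the score.

    Assumption: the field labels and their order are illustrative only.
    """
    source = f"Rubric: {rubric}\nEssay: {essay}"
    target = f"Reasoning: {llm_reasoning.strip()}\nScore: {score}"
    return DistillationExample(source=source, target=target)


def parse_score(generated: str) -> Optional[int]:
    """Recover the integer score from generated text, or None if it is absent."""
    match = re.search(r"Score:\s*(-?\d+)", generated)
    return int(match.group(1)) if match else None
```

At inference time, a model trained on such targets would emit a rationale followed by a score, and a helper like `parse_score` could extract the number for evaluation while the rationale stays available to the reader.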
