CL · Oct 12, 2024

AERA Chat: An Interactive Platform for Automated Explainable Student Answer Assessment

arXiv:2410.09507v2 · 1 citation · h-index: 11 · EMNLP
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of unreliable rationales in automated scoring, which affects both educators and researchers. Its contribution is incremental: it builds on existing LLM methods and adds new visualization and annotation tools.

The paper tackles the challenge of generating high-quality assessment rationales in automated student answer scoring. It presents AERA Chat, an interactive visualization platform that uses multiple LLMs to score answers and generate rationales. Evaluations on several datasets demonstrate that the platform supports robust rationale evaluation and comparative analysis.

Explainability in automated student answer scoring systems is critical for building trust and enhancing usability among educators. Yet, generating high-quality assessment rationales remains challenging due to the scarcity of annotated data and the prohibitive cost of manual verification, prompting heavy reliance on rationales produced by large language models (LLMs), which are often noisy and unreliable. To address these limitations, we present AERA Chat, an interactive visualization platform designed for automated explainable student answer assessment. AERA Chat leverages multiple LLMs to concurrently score student answers and generate explanatory rationales, offering innovative visualization features that highlight critical answer components and rationale justifications. The platform also incorporates intuitive annotation and evaluation tools, supporting educators in marking tasks and researchers in evaluating rationale quality from different models. We demonstrate the effectiveness of our platform through evaluations of multiple rationale-generation methods on several datasets, showcasing its capability for facilitating robust rationale evaluation and comparative analysis.

