Reducing the Cost: Cross-Prompt Pre-Finetuning for Short Answer Scoring
This work addresses resource constraints in educational settings such as schools and online courses by reducing the amount of new data and training required for each prompt. The contribution is incremental, since it builds on existing fine-tuning methods.
The paper tackles the high cost of training a separate model for each new prompt in automated short answer scoring. It proposes a two-phase approach that pre-trains on existing prompts using key phrases and then fine-tunes on the new prompt, which significantly improves scoring accuracy, especially when training data is limited.
Automated Short Answer Scoring (SAS) is the task of automatically scoring a given answer to a prompt based on rubrics and reference answers. Although SAS is useful in real-world applications, both rubrics and reference answers differ between prompts, so new data must be acquired and a model trained for each new prompt. Such requirements are costly, especially for schools and online courses where resources are limited and only a few prompts are used. In this work, we attempt to reduce this cost through a two-phase approach: train a model on existing rubrics and answers with gold score signals, then finetune it on a new prompt. Specifically, given that scoring rubrics and reference answers differ for each prompt, we utilize key phrases, that is, representative expressions that an answer should contain to earn a higher score, and train a SAS model to learn the relationship between key phrases and answers using already annotated prompts (i.e., cross-prompts). Our experimental results show that pre-finetuning on existing cross-prompt data with key phrases significantly improves scoring accuracy, especially when the training data is limited. Finally, our extensive analysis shows that it is crucial to design the model so that it can learn the task's general property.
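The two-phase schedule described above can be illustrated with a deliberately small sketch. It is not the model used in this work: the key-phrase coverage feature, the linear scorer, the toy data, and the names overlap_features, train, and predict are all assumptions made for illustration. The only point it aims to show is that a scorer conditioned on (key phrases, answer) pairs can be trained on already annotated prompts and then briefly finetuned on a few examples from a new prompt.

```python
# Toy illustration only: a linear scorer over one hand-made feature.
# All names, data, and hyperparameters here are hypothetical.

def overlap_features(key_phrases, answer):
    """Return [bias, fraction of key-phrase tokens found in the answer]."""
    answer_tokens = set(answer.lower().split())
    key_tokens = {tok for phrase in key_phrases for tok in phrase.lower().split()}
    coverage = len(key_tokens & answer_tokens) / len(key_tokens) if key_tokens else 0.0
    return [1.0, coverage]


def predict(weights, key_phrases, answer):
    """Score an answer given the prompt's key phrases."""
    x = overlap_features(key_phrases, answer)
    return sum(w * xi for w, xi in zip(weights, x))


def train(weights, examples, epochs, lr):
    """Plain SGD on squared error; returns an updated copy of the weights."""
    w = list(weights)
    for _ in range(epochs):
        for key_phrases, answer, score in examples:
            x = overlap_features(key_phrases, answer)
            err = sum(wi * xi for wi, xi in zip(w, x)) - score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


# Phase 1 data: already annotated prompts (cross-prompt), scores scaled to [0, 1].
CROSS_PROMPT = [
    (["photosynthesis", "sunlight"], "plants use sunlight for photosynthesis", 1.0),
    (["photosynthesis", "sunlight"], "plants grow in soil", 0.0),
    (["supply", "demand"], "prices rise when demand exceeds supply", 1.0),
    (["supply", "demand"], "prices rise when demand is high", 0.5),
    (["supply", "demand"], "the shop was closed", 0.0),
]

# Phase 2 data: a handful of annotated answers for the new prompt.
NEW_PROMPT = [
    (["evaporation", "condensation"], "water cycles through evaporation and condensation", 1.0),
    (["evaporation", "condensation"], "water is wet", 0.0),
]

# Phase 1: learn the general key-phrase/answer relationship across prompts.
pretrained = train([0.0, 0.0], CROSS_PROMPT, epochs=200, lr=0.1)
# Phase 2: short finetuning run on the new prompt, starting from phase-1 weights.
finetuned = train(pretrained, NEW_PROMPT, epochs=20, lr=0.05)

if __name__ == "__main__":
    kp = ["evaporation", "condensation"]
    print(predict(finetuned, kp, "evaporation then condensation forms clouds"))
    print(predict(finetuned, kp, "clouds are white"))
```

Because the scorer takes the key phrases as part of its input rather than memorizing prompt-specific wording, what it learns in the first phase carries over to a prompt it has never seen. In this sketch, that is why the second phase needs only a couple of labeled answers.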