CYLG
Jan 15, 2021

Automating Program Structure Classification

arXiv:2101.10087v1
3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for better tools to assist teachers in exploring student solutions, though it is incremental as it applies existing methods to a new educational domain.

The paper tackles the time-consuming manual analysis of student program structure by using supervised machine learning models to classify programs automatically, reaching 91% classification accuracy when trained on 108 programs.

When students write programs, their program structure provides insight into their learning process. However, analyzing program structure by hand is time-consuming, and teachers need better tools for computer-assisted exploration of student solutions. As a first step towards an education-oriented program analysis toolkit, we show how supervised machine learning methods can automatically classify student programs into a predetermined set of high-level structures. We evaluate two models on classifying student solutions to the Rainfall problem: a nearest-neighbors classifier using syntax tree edit distance and a recurrent neural network. We demonstrate that these models can achieve 91% classification accuracy when trained on 108 programs. We further explore the generality, trade-offs, and failure cases of each model.
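To make the nearest-neighbors idea concrete, here is a minimal sketch under stated assumptions, not the paper's implementation. It swaps in a simpler distance: Levenshtein edit distance over the preorder sequence of AST node types, rather than a true syntax tree edit distance. It also assumes the student programs are Python, and the function names, the labels `single_loop` and `clean_first`, the `-999` sentinel, and the toy programs are all hypothetical illustrations.

```python
import ast
from collections import Counter


def node_types(source):
    """Return the preorder sequence of AST node type names for Python source."""
    out = []

    def visit(node):
        out.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return out


def edit_distance(a, b):
    """Levenshtein distance between two sequences (a stand-in for tree edit distance)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def classify(program, labeled, k=1):
    """Majority label among the k labeled programs closest to `program`."""
    seq = node_types(program)
    nearest = sorted(
        labeled, key=lambda pair: edit_distance(seq, node_types(pair[0]))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical training examples: two toy Rainfall-style solutions.
SINGLE_LOOP = """
def rainfall(xs):
    total = 0
    count = 0
    for x in xs:
        if x == -999:
            break
        if x >= 0:
            total += x
            count += 1
    return total / count
"""

CLEAN_FIRST = """
def rainfall(xs):
    clean = []
    for x in xs:
        if x == -999:
            break
        if x >= 0:
            clean.append(x)
    return sum(clean) / len(clean)
"""

TRAINING = [(SINGLE_LOOP, "single_loop"), (CLEAN_FIRST, "clean_first")]

# A new submission with the same shape as SINGLE_LOOP but different names.
NEW_SUBMISSION = """
def average_rain(readings):
    s = 0
    n = 0
    for r in readings:
        if r == -999:
            break
        if r >= 0:
            s += r
            n += 1
    return s / n
"""

predicted = classify(NEW_SUBMISSION, TRAINING)
```

Because the distance looks only at node types, renaming variables does not change it, so the new submission lands on the single-loop example. A real tree edit distance would also account for parent-child structure, which this flattened sequence loses.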

Code Implementations: 1 repo