Estimating Difficulty Levels of Programming Problems with Pre-trained Model
The paper tackles the problem of automatically estimating the difficulty level of programming problems on online judge platforms by coupling two pre-trained models, one for text and one for code, and reports effective results on two newly built datasets.
This approach supports adaptive learning in programming education by reducing reliance on expert annotations or extensive student solution data.
As the demand for programming skills grows across industry and academia, students often turn to Programming Online Judge (POJ) platforms for coding practice and competition. The difficulty level of each programming problem serves as an essential reference for guiding students' adaptive learning. However, current methods of determining difficulty levels either require extensive expert annotation or need a long time to accumulate enough student solutions for each problem. To address this issue, we formulate the task of automatically estimating the difficulty level of a programming problem, given its textual description and an example solution in code. To tackle this task, we propose coupling two pre-trained models, one for the text modality and the other for the code modality, into a unified model. We built two POJ datasets for the task; the results demonstrate the effectiveness of the proposed approach and the contribution of each modality.
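The coupling idea can be illustrated with a minimal sketch. The abstract does not say how the two models are combined, so everything below is an assumption made for illustration: the fusion by concatenation, the linear classifier, the three difficulty levels, and the hypothetical names `toy_encoder` and `CoupledDifficultyModel`. The hash-based encoders are placeholders standing in for real pre-trained text and code models, and the weights are random rather than trained.

```python
import hashlib
import math
import random
from typing import Callable, List

Vector = List[float]


def toy_encoder(salt: str, dim: int = 8) -> Callable[[str], Vector]:
    """Return a deterministic placeholder for a pre-trained encoder.

    A real system would call a pre-trained text or code model here;
    this stand-in only hashes whitespace-separated tokens into a
    fixed-size, L2-normalised count vector.
    """
    def encode(source: str) -> Vector:
        vec = [0.0] * dim
        for token in source.split():
            digest = hashlib.sha256((salt + token).encode("utf-8")).digest()
            vec[digest[0] % dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
        return [v / norm for v in vec]
    return encode


def softmax(scores: Vector) -> Vector:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class CoupledDifficultyModel:
    """Assumed fusion: concatenate both embeddings, then apply a linear layer."""

    def __init__(self, text_encoder, code_encoder, weights, biases):
        self.text_encoder = text_encoder
        self.code_encoder = code_encoder
        self.weights = weights  # one row of weights per difficulty level
        self.biases = biases

    def predict_proba(self, description: str, solution: str) -> Vector:
        # Both modalities contribute: text embedding followed by code embedding.
        fused = self.text_encoder(description) + self.code_encoder(solution)
        scores = [
            sum(w * x for w, x in zip(row, fused)) + b
            for row, b in zip(self.weights, self.biases)
        ]
        return softmax(scores)

    def predict(self, description: str, solution: str) -> int:
        probs = self.predict_proba(description, solution)
        return max(range(len(probs)), key=probs.__getitem__)


# Usage with untrained, randomly initialised weights (illustrative only).
NUM_LEVELS = 3  # assumed number of difficulty levels
DIM = 8         # assumed embedding size per modality
rng = random.Random(0)
model = CoupledDifficultyModel(
    toy_encoder("text", DIM),
    toy_encoder("code", DIM),
    weights=[[rng.uniform(-1, 1) for _ in range(2 * DIM)] for _ in range(NUM_LEVELS)],
    biases=[0.0] * NUM_LEVELS,
)
description = "Given a list of integers, print their sum."
solution = "print(sum(map(int, input().split())))"
probs = model.predict_proba(description, solution)
level = model.predict(description, solution)
```

In a trained system the classifier weights, and possibly the encoders themselves, would be fitted on labelled problems. Dropping either half of the fused vector is one simple way to probe how much each modality contributes.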