CL, Feb 3

Automatic Classification of Pedagogical Materials against CS Curriculum Guidelines

Erik Saule, Kalpathi Subramanian, Razvan Bunescu

arXiv:2602.03962v1

Originality: Synthesis-oriented

AI Analysis

This work addresses a practical issue for CS program administrators by automating curriculum auditing, though it is incremental, as it applies existing NLP methods to a new domain.

The paper tackles the problem of assessing how much of the Computer Science curriculum guidelines a program covers, a task that is time-consuming and demanding for administrators. It uses Natural Language Processing techniques to automatically classify pedagogical materials and shows that meaningful classification can be achieved.

Professional societies often publish curriculum guidelines to help programs align their content to international standards. In Computer Science, the primary standard is published by ACM and IEEE and provides detailed guidelines for what should and could be included in a Computer Science program. While these guidelines are very helpful, it remains difficult for program administrators to assess how much of them a CS program covers. This is due in particular to the extensiveness of the guidelines, which contain thousands of individual items. As such, it is time-consuming and cognitively demanding to audit every course and confidently mark everything that is actually covered. Our preliminary work indicated that it takes about a day of work per course. In this work, we propose using Natural Language Processing techniques to accelerate the process. We explore two kinds of techniques: the first relies on traditional tools for parsing, tagging, and embeddings, while the second leverages the power of Large Language Models. We evaluate the application of these techniques to classify a corpus of pedagogical materials and show that we can meaningfully classify documents automatically.
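
To make the classification task concrete, the sketch below ranks guideline items by their similarity to a piece of pedagogical material. It is only an illustration and not the authors' method: it swaps in a plain bag-of-words cosine similarity as a simple stand-in for the embedding-based techniques the abstract mentions, and the guideline items, their identifiers, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical guideline items for illustration only; the real ACM/IEEE
# guidelines contain thousands of items with their own wording.
GUIDELINE_ITEMS = {
    "sorting": "sorting algorithms such as quicksort and mergesort",
    "graphs": "graph traversal with breadth first search and depth first search",
    "hashing": "hash tables and collision resolution",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_items(material, items=GUIDELINE_ITEMS):
    """Return (item_id, score) pairs sorted from most to least similar."""
    doc = Counter(tokenize(material))
    scores = [
        (item_id, cosine(doc, Counter(tokenize(text))))
        for item_id, text in items.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

For example, `rank_items("Lab 4: implement breadth first search on a graph")` places the hypothetical `"graphs"` item first, since it is the only item sharing words with the material. A realistic pipeline would replace the term counts with learned embeddings or an LLM prompt, as the abstract describes.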
