Mohammad Hassany

h-index: 3

6 papers

21 citations

Novelty: 49%

AI Score: 40

Ranked #71,046 of 194,257 authors (top 37%); #471 in HC (top 19%)

6 Papers

4.7 · SE · Sep 22, 2024

Evaluating the Quality of Code Comments Generated by Large Language Models for Novice Programmers

Aysa Xuemo Fan, Arun Balajiee Lekshmi Narayanan, Mohammad Hassany et al.

Large Language Models (LLMs) show promise in generating code comments for novice programmers, but their educational effectiveness remains under-evaluated. This study assesses the instructional quality of code comments produced by GPT-4, GPT-3.5-Turbo, and Llama2, compared to expert-developed comments, focusing on their suitability for novices. Analyzing a dataset of "easy" level Java solutions from LeetCode, we find that GPT-4 exhibits comparable quality to expert comments in aspects critical for beginners, such as clarity, beginner-friendliness, concept elucidation, and step-by-step guidance. GPT-4 outperforms Llama2 in discussing complexity (chi-square = 11.40, p = 0.001) and is perceived as significantly more supportive for beginners than GPT-3.5 and Llama2 (Mann-Whitney U = 300.5 and 322.5, p = 0.0017 and 0.0003, respectively). This study highlights the potential of LLMs for generating code comments tailored to novice programmers.

8.3 · HC · May 20

Exploring the Effectiveness of Using LLMs for Automated Assessment of Student Self Explanations in Programming Education

Arun-Balajiee Lekshmi-Narayanan, Mohammad Hassany, Peter Brusilovsky

Worked examples are step-by-step solutions to problems in a specific domain, offered to students to acquire domain-specific problem-solving skills. The effectiveness of worked examples could be enhanced by combining them with self-explanations, which ask students to explain rather than passively study each problem-solving step. The main challenge of this approach is assessing the correctness of the student's explanations. In the prevailing approach, student explanations are judged by their semantic similarity to an instructor's or domain expert's explanation. Given recent advances in LLM-based automated scoring, it remains unclear whether semantic similarity methods are still the most effective technique to automatically score textual student responses like essays or code explanations. Comparing these methods also requires quality datasets that offer distinctive features such as balanced class distributions and domain-specific labeled data for automated scoring tasks. In this paper, we present a rigorous comparison between LLMs and semantic similarity used for automated scoring, framed as a binary classification task.
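The semantic-similarity baseline described above can be sketched as a thresholded similarity score. This is a minimal illustration only: the bag-of-words vectors stand in for whatever text representation the paper actually uses, and the function names and the 0.5 threshold are hypothetical choices, not values from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty explanation is similar to nothing
    return dot / (norm_a * norm_b)


def is_correct(student: str, expert: str, threshold: float = 0.5) -> bool:
    """Binary decision: accept the student's explanation if it is
    similar enough to the expert's. The threshold is an assumption."""
    score = cosine_similarity(bag_of_words(student), bag_of_words(expert))
    return score >= threshold
```

An LLM-based scorer would replace `is_correct` with a model call that returns the same binary label, which is what makes the two approaches directly comparable as classifiers.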

6.7 · HC · Feb 26, 2024

Human-AI Co-Creation of Worked Examples for Programming Classes

Mohammad Hassany, Peter Brusilovsky, Jiaze Ke et al.

Worked examples (solutions to typical programming problems, presented as source code in a particular language and used to explain topics in a programming class) are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide line-by-line explanations for the large number of examples typically used in a programming class. In this paper, we explore and assess a human-AI collaboration approach to authoring worked examples for Java programming. We introduce an authoring system for creating Java worked examples that generates a starting version of code explanations and presents it to the instructor to edit if necessary. We also present a study that assesses the quality of explanations created with this approach.

5.5 · HC · Dec 4, 2023

Authoring Worked Examples for Java Programming with Human-AI Collaboration

Mohammad Hassany, Peter Brusilovsky, Jiaze Ke et al.

Worked examples (solutions to typical programming problems, presented as source code in a particular language and used to explain topics in a programming class) are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide line-by-line explanations for the large number of examples typically used in a programming class. In this paper, we explore and assess a human-AI collaboration approach to authoring worked examples for Java programming. We introduce an authoring system for creating Java worked examples that generates a starting version of code explanations and presents it to the instructor to edit if necessary. We also present a study that assesses the quality of explanations created with this approach.

CY · Jun 9

Automated Recommendation of Programming Learning Content Using Pattern-based Knowledge Components

Muntasir Hoq, Griffin Pitts, Zhangqi Duan et al.

Introductory programming instruction relies on hands-on practice and short learning activities to support mastery of foundational concepts. Although many such learning resources exist, organizing and linking these items in instructionally meaningful ways is challenging without time-intensive expert curation. This study investigates the use of pattern-based Knowledge Components (KCs) to automatically identify code-based learning resources targeting similar concepts. In our approach, pattern-based KCs are extracted from each code sample, and related activities are identified by measuring similarity between the KC sets associated with each activity. By leveraging alignment at the level of semantically important programming patterns, this method supports contextually appropriate and pedagogically useful recommendations. We evaluate our approach on an expert-organized corpus of introductory Python materials in which instructors grouped items into bundles based on conceptual similarity. Results show that our pattern-based KC approach retrieves resources that align with this expert organization and outperforms representative KC- and embedding-based baselines across standard ranking evaluations. Overall, the framework supports targeted, concept-oriented guidance for programming learners and can help instructors organize, bundle, and recommend instructional content at scale.
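The idea of recommending activities by comparing their KC sets can be sketched as follows. The abstract does not name the similarity measure, so Jaccard similarity is an assumption here, and the activity names and KC labels in `EXAMPLE_KC_SETS` are invented purely for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared KCs divided by all KCs in either set."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(query_id: str, kc_sets: dict[str, set[str]], top_k: int = 3):
    """Rank all other activities by KC-set similarity to the query activity.

    Ties are broken alphabetically by activity id so the order is stable.
    """
    query = kc_sets[query_id]
    scored = [
        (other_id, jaccard(query, kcs))
        for other_id, kcs in kc_sets.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical activities and KC labels, not taken from the paper's corpus.
EXAMPLE_KC_SETS = {
    "sum_list": {"for_loop", "accumulator"},
    "count_evens": {"for_loop", "accumulator", "conditional"},
    "max_of_two": {"conditional"},
    "read_file": {"file_io"},
}

if __name__ == "__main__":
    for activity, score in recommend("sum_list", EXAMPLE_KC_SETS, top_k=2):
        print(f"{activity}: {score:.2f}")
```

In this toy data, `count_evens` ranks first for `sum_list` because it shares two of the three KCs that appear in either activity.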

7.8 · AI · Feb 25, 2025 · Code

Automated Knowledge Component Generation for Interpretable Knowledge Tracing in Coding Problems

Zhangqi Duan, Nigel Fernandez, Arun Balajiee Lekshmi Narayanan et al.

Knowledge components (KCs) mapped to problems help model student learning by tracking mastery levels on fine-grained skills, thereby facilitating personalized learning and feedback in online learning platforms. However, crafting and tagging KCs to problems, traditionally performed by human domain experts, is highly labor-intensive. We present an automated, LLM-based pipeline for KC generation and tagging for open-ended programming problems. We also develop an LLM-based knowledge tracing (KT) framework to leverage these LLM-generated KCs, which we refer to as KCGen-KT. We conduct extensive quantitative and qualitative evaluations on two real-world student code submission datasets in different programming languages. We find that KCGen-KT outperforms existing KT methods and human-written KCs on future student response prediction. We investigate the learning curves of generated KCs and show that LLM-generated KCs result in a better fit than human-written KCs under a cognitive model. We also conduct a human evaluation with course instructors to show that our pipeline generates reasonably accurate problem-KC mappings.