CY AIFeb 24

Can AI be a Teaching Partner? Evaluating ChatGPT, Gemini, and DeepSeek across Three Teaching Strategies

Talita de Paula Cypriano de Souza, Shruti Mehta, Matheus Arataque Uema, Luciano Bernardes de Paula, Seiji Isotani

arXiv:2603.26673h-index: 37

AI Analysis

This addresses the need for empirical evidence on LLMs' pedagogical skills for students, though it is incremental as it evaluates existing models on new teaching strategies.

The study compared ChatGPT, Gemini, and DeepSeek as teaching agents for C programming, finding that ChatGPT and Gemini scored higher than DeepSeek, with models showing similar patterns in Examples and Explanations but greater sensitivity to prompts in the Socratic Method.

There are growing promises that Large Language Models (LLMs) can support students' learning by providing explanations, feedback, and guidance. However, despite their rapid adoption and widespread attention, there is still limited empirical evidence regarding the pedagogical skills of LLMs. This article presents a comparative study of popular LLMs, namely, ChatGPT, DeepSeek, and Gemini, acting as teaching agents. An evaluation protocol was developed, focusing on three pedagogical strategies: Examples, Explanations and Analogies, and the Socratic Method. Six human judges conducted the evaluations in the context of teaching the C programming language to beginners. The results indicate that LLM models exhibited similar interaction patterns in the pedagogical strategies of Examples and Explanations and Analogies. In contrast, for the Socratic Method, the models showed greater sensitivity to the pedagogical strategy and the initial prompt. Overall, ChatGPT and Gemini received higher scores, whereas DeepSeek obtained lower scores across the criteria, indicating differences in pedagogical performance across models.

View on arXiv PDF

Similar