Benefits and Risks of Using ChatGPT4 as a Teaching Assistant for Computer Science Students
This study addresses the potential and limitations of using AI chatbots as teaching assistants in computer science education, highlighting the risks in specialized domains.
The study evaluated ChatGPT3.5's ability to assist computer science students by testing its correctness on questions ranging from basic algorithms to quantum computing, finding that performance drastically declines with domain specialization, from near-perfect accuracy in basic algorithms to frequent errors in quantum computing.
Upon release, ChatGPT3.5 shocked the software engineering community with its ability to generate answers to specialized questions about coding. Immediately, many educators wondered whether the chatbot could be used as a support tool that helps students answer their programming questions. This article evaluates this possibility at three levels: fundamental Computer Science knowledge (basic algorithms and data structures), core competency (design patterns), and advanced knowledge (quantum computing). In each case, we ask ChatGPT3.5 normalized questions several times, then assess the correctness of the answers, and finally check whether this creates issues. The main result is that the performance of ChatGPT3.5 degrades drastically as the specialization of the domain increases: for basic algorithms, it returns answers that are almost always correct; for design patterns, the generated code contains many code smells and is generally of low quality, although ChatGPT3.5 is sometimes able to fix the code if asked; and for quantum computing, it is often blatantly wrong.