CLCYMar 8, 2023

ChatGPT Participates in a Computer Science Exam

arXiv:2303.09461v248 citationsh-index: 45Has Code
AI Analysis

This incremental work assesses AI performance on academic tasks, highlighting potential but not proving understanding, relevant for educators and AI researchers.

The study tested ChatGPT on an undergraduate computer science exam, finding it narrowly passed with 20.5 out of 40 points, and GPT-4 improved performance by 17% to match average student scores.

We asked ChatGPT to participate in an undergraduate computer science exam on ''Algorithms and Data Structures''. The program was evaluated on the entire exam as posed to the students. We hand-copied its answers onto an exam sheet, which was subsequently graded in a blind setup alongside those of 200 participating students. We find that ChatGPT narrowly passed the exam, obtaining 20.5 out of 40 points. This impressive performance indicates that ChatGPT can indeed succeed in challenging tasks like university exams. At the same time, the questions in our exam are structurally similar to those of other exams, solved homework problems, and teaching materials that can be found online and might have been part of ChatGPT's training data. Therefore, it would be inadequate to conclude from this experiment that ChatGPT has any understanding of computer science. We also assess the improvements brought by GPT-4. We find that GPT-4 would have obtained about 17\% more exam points than GPT-3.5, reaching the performance of the average student. The transcripts of our conversations with ChatGPT are available at \url{https://github.com/tml-tuebingen/chatgpt-algorithm-exam}, and the entire graded exam is in the appendix of this paper.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes