CRAISE · Apr 12, 2023

Evaluation of ChatGPT Model for Vulnerability Detection

arXiv:2304.07232v1 · 88 citations · h-index: 20
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of assessing AI models for cybersecurity tasks, but it is incremental: it applies existing models to a new dataset and reports negative results.

The study evaluated ChatGPT and GPT-3 for vulnerability detection in code using a real-world dataset with binary and multi-label classification tasks, finding that ChatGPT performed no better than a dummy classifier.

In this technical report, we evaluated the performance of the ChatGPT and GPT-3 models for the task of vulnerability detection in code. Our evaluation was conducted on our real-world dataset, using binary and multi-label classification tasks on CWE vulnerabilities. We decided to evaluate the model because it has shown good performance on other code-based tasks, such as solving programming challenges and understanding code at a high level. However, we found that the ChatGPT model performed no better than a dummy classifier for both binary and multi-label classification tasks for code vulnerability detection.
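To make the comparison against a dummy classifier concrete, the following is a minimal sketch of how a binary vulnerability detector might be scored against a constant baseline. It is not the report's evaluation code: the labels and predictions are synthetic toy values, the choice of F1 as the metric is an assumption, and the helper names `f1_score` and `dummy_constant` are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def dummy_constant(y_true, label=1):
    """Baseline that ignores the input and always predicts `label`."""
    return [label] * len(y_true)


# Synthetic toy data (NOT from the report): 1 = vulnerable, 0 = not vulnerable.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
model_pred = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]  # hypothetical model output

model_f1 = f1_score(y_true, model_pred)
dummy_f1 = f1_score(y_true, dummy_constant(y_true))

print(f"model F1: {model_f1:.3f}")
print(f"dummy F1: {dummy_f1:.3f}")
print("model beats dummy:", model_f1 > dummy_f1)
```

On this toy data the always-positive baseline scores slightly higher than the hypothetical model, which is the kind of outcome the report describes: a detector whose score does not exceed that of a baseline that never looks at the code.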

