AI · CL · CY · Aug 21, 2023

Using Large Language Models for Cybersecurity Capture-The-Flag Challenges and Certification Questions

arXiv:2308.10443v1 · 50 citations · h-index: 35
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of academic integrity in cybersecurity education raised by freely available AI tools, but it is incremental: it evaluates existing models on new data without proposing novel solutions.

The research investigated the effectiveness of large language models (LLMs) such as ChatGPT, Bard, and Bing in solving cybersecurity Capture-The-Flag challenges and certification questions, finding that they can perform well, which raises academic integrity concerns.

The assessment of cybersecurity Capture-The-Flag (CTF) exercises involves participants finding text strings or "flags" by exploiting system vulnerabilities. Large Language Models (LLMs) are natural-language models trained on vast amounts of words to understand and generate text; they can perform well on many CTF challenges. Such LLMs are freely available to students. In the context of CTF exercises in the classroom, this raises concerns about academic integrity. Educators must understand LLMs' capabilities to modify their teaching to accommodate generative AI assistance. This research investigates the effectiveness of LLMs, particularly in the realm of CTF challenges and questions. Here we evaluate three popular LLMs: OpenAI ChatGPT, Google Bard, and Microsoft Bing. First, we assess the LLMs' question-answering performance on five Cisco certifications with varying difficulty levels. Next, we qualitatively study the LLMs' abilities in solving CTF challenges to understand their limitations. We report on the experience of using the LLMs for seven test cases in all five types of CTF challenges. In addition, we demonstrate how jailbreak prompts can bypass and break LLMs' ethical safeguards. The paper concludes by discussing LLMs' impact on CTF exercises and its implications.

