Valdemar Švábenský

CR
h-index19
23papers
255citations
Novelty24%
AI Score34

23 Papers

CRJul 13, 2023
Student Assessment in Cybersecurity Training Automated by Pattern Mining and Clustering

Valdemar Švábenský, Jan Vykopal, Pavel Čeleda et al.

Hands-on cybersecurity training allows students and professionals to practice various tools and improve their technical skills. The training occurs in an interactive learning environment that enables completing sophisticated tasks in full-fledged operating systems, networks, and applications. During the training, the learning environment allows collecting data about trainees' interactions with the environment, such as their usage of command-line tools. These data contain patterns indicative of trainees' learning processes, and revealing them allows to assess the trainees and provide feedback to help them learn. However, automated analysis of these data is challenging. The training tasks feature complex problem-solving, and many different solution approaches are possible. Moreover, the trainees generate vast amounts of interaction data. This paper explores a dataset from 18 cybersecurity training sessions using data mining and machine learning techniques. We employed pattern mining and clustering to analyze 8834 commands collected from 113 trainees, revealing their typical behavior, mistakes, solution strategies, and difficult training stages. Pattern mining proved suitable in capturing timing information and tool usage frequency. Clustering underlined that many trainees often face the same issues, which can be addressed by targeted scaffolding. Our results show that data mining methods are suitable for analyzing cybersecurity training data. Educational researchers and practitioners can apply these methods in their contexts to assess trainees, support them, and improve the training design. Artifacts associated with this research are publicly available.

CLJul 30, 2024
Comparison of Large Language Models for Generating Contextually Relevant Questions

Ivo Lodovico Molina, Valdemar Švábenský, Tsubasa Minematsu et al.

This study explores the effectiveness of Large Language Models (LLMs) for Automatic Question Generation in educational settings. Three LLMs are compared in their ability to create questions from university slide text without fine-tuning. Questions were obtained in a two-step pipeline: first, answer phrases were extracted from slides using Llama 2-Chat 13B; then, the three models generated questions for each answer. To analyze whether the questions would be suitable in educational applications for students, a survey was conducted with 46 students who evaluated a total of 246 questions across five metrics: clarity, relevance, difficulty, slide relation, and question-answer alignment. Results indicate that GPT-3.5 and Llama 2-Chat 13B outperform Flan T5 XXL by a small margin, particularly in terms of clarity and question-answer alignment. GPT-3.5 especially excels at tailoring questions to match the input answers. The contribution of this research is the analysis of the capacity of LLMs for Automatic Question Generation in education.

LGAug 16, 2024
Detecting Unsuccessful Students in Cybersecurity Exercises in Two Different Learning Environments

Valdemar Švábenský, Kristián Tkáčik, Aubrey Birdwell et al.

This full paper in the research track evaluates the usage of data logged from cybersecurity exercises in order to predict students who are potentially at risk of performing poorly. Hands-on exercises are essential for learning since they enable students to practice their skills. In cybersecurity, hands-on exercises are often complex and require knowledge of many topics. Therefore, students may miss solutions due to gaps in their knowledge and become frustrated, which impedes their learning. Targeted aid by the instructor helps, but since the instructor's time is limited, efficient ways to detect struggling students are needed. This paper develops automated tools to predict when a student is having difficulty. We formed a dataset with the actions of 313 students from two countries and two learning environments: KYPO CRP and EDURange. These data are used in machine learning algorithms to predict the success of students in exercises deployed in these environments. After extracting features from the data, we trained and cross-validated eight classifiers for predicting the exercise outcome and evaluated their predictive power. The contribution of this paper is comparing two approaches to feature engineering, modeling, and classification performance on data from two learning environments. Using the features from either learning environment, we were able to detect and distinguish between successful and struggling students. A decision tree classifier achieved the highest balanced accuracy and sensitivity with data from both learning environments. The results show that activity data from cybersecurity exercises are suitable for predicting student success. In a potential application, such models can aid instructors in detecting struggling students and providing targeted help. We publish data and code for building these models so that others can adopt or adapt them.

LGJul 14, 2023
Towards Generalizable Detection of Urgency of Discussion Forum Posts

Valdemar Švábenský, Ryan S. Baker, Andrés Zambrano et al.

Students who take an online course, such as a MOOC, use the course's discussion forum to ask questions or reach out to instructors when encountering an issue. However, reading and responding to students' questions is difficult to scale because of the time needed to consider each message. As a result, critical issues may be left unresolved, and students may lose the motivation to continue in the course. To help address this problem, we build predictive models that automatically determine the urgency of each forum post, so that these posts can be brought to instructors' attention. This paper goes beyond previous work by predicting not just a binary decision cut-off but a post's level of urgency on a 7-point scale. First, we train and cross-validate several models on an original data set of 3,503 posts from MOOCs at University of Pennsylvania. Second, to determine the generalizability of our models, we test their performance on a separate, previously published data set of 29,604 posts from MOOCs at Stanford University. While the previous work on post urgency used only one data set, we evaluated the prediction across different data sets and courses. The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. Understanding the urgency of forum posts enables instructors to focus their time more effectively and, as a result, better support student learning.

CYJan 4, 2022Code
Preventing Cheating in Hands-on Lab Assignments

Jan Vykopal, Valdemar Švábenský, Pavel Seda et al.

Networking, operating systems, and cybersecurity skills are exercised best in an authentic environment. Students work with real systems and tools in a lab environment and complete assigned tasks. Since all students typically receive the same assignment, they can consult their approach and progress with an instructor, a tutoring system, or their peers. They may also search for information on the Internet. Having the same assignment for all students in class is standard practice efficient for learning and developing skills. However, it is prone to cheating when used in a summative assessment such as graded homework, a mid-term test, or a final exam. Students can easily share and submit correct answers without completing the assignment. In this paper, we discuss methods for automatic problem generation for hands-on tasks completed in a computer lab environment. Using this approach, each student receives personalized tasks. We developed software for generating and submitting these personalized tasks and conducted a case study. The software was used for creating and grading a homework assignment in an introductory security course enrolled by 207 students. The software revealed seven cases of suspicious submissions, which may constitute cheating. In addition, students and instructors welcomed the personalized assignments. Instructors commented that this approach scales well for large classes. Students rarely encountered issues while running their personalized lab environment. Finally, we have released the open-source software to enable other educators to use it in their courses and learning environments.

CRDec 21, 2021Code
Toolset for Collecting Shell Commands and Its Application in Hands-on Cybersecurity Training

Valdemar Švábenský, Jan Vykopal, Daniel Tovarňák et al.

When learning cybersecurity, operating systems, or networking, students perform practical tasks using a broad range of command-line tools. Collecting and analyzing data about the command usage can reveal valuable insights into how students progress and where they make mistakes. However, few learning environments support recording and inspecting command-line inputs, and setting up an efficient infrastructure for this purpose is challenging. To aid engineering and computing educators, we share the design and implementation of an open-source toolset for logging commands that students execute on Linux machines. Compared to basic solutions, such as shell history files, the toolset's added value is threefold. 1) Its configuration is automated so that it can be easily used in classes on different topics. 2) It collects metadata about the command execution, such as a timestamp, hostname, and IP address. 3) Data are instantly forwarded to central storage in a unified, semi-structured format. This enables automated processing, both in real-time and post hoc, to enhance the instructors' understanding of student actions. The toolset works independently of the teaching content, the training network's topology, or the number of students working in parallel. We demonstrated the toolset's value in two learning environments at four training sessions. Over two semesters, 50 students played educational cybersecurity games using a Linux command-line interface. Each training session lasted approximately two hours, during which we recorded 4439 shell commands. The semi-automated data analysis revealed solution patterns, used tools, and misconceptions of students. Our insights from creating the toolset and applying it in teaching practice are relevant for instructors, researchers, and developers of learning environments. We provide the software and data resulting from this work so that others can use them.

CYDec 3, 2021Code
Evaluating Two Approaches to Assessing Student Progress in Cybersecurity Exercises

Valdemar Švábenský, Richard Weiss, Jack Cook et al.

Cybersecurity students need to develop practical skills such as using command-line tools. Hands-on exercises are the most direct way to assess these skills, but assessing students' mastery is a challenging task for instructors. We aim to alleviate this issue by modeling and visualizing student progress automatically throughout the exercise. The progress is summarized by graph models based on the shell commands students typed to achieve discrete tasks within the exercise. We implemented two types of models and compared them using data from 46 students at two universities. To evaluate our models, we surveyed 22 experienced computing instructors and qualitatively analyzed their responses. The majority of instructors interpreted the graph models effectively and identified strengths, weaknesses, and assessment use cases for each model. Based on the evaluation, we provide recommendations to instructors and explain how our graph models innovate teaching and promote further research. The impact of this paper is threefold. First, it demonstrates how multiple institutions can collaborate to share approaches to modeling student progress in hands-on exercises. Second, our modeling techniques generalize to data from different environments to support student assessment, even outside the cybersecurity domain. Third, we share the acquired data and open-source software so that others can use the models in their classes or research.

CRApr 24, 2020Code
KYPO4INDUSTRY: A Testbed for Teaching Cybersecurity of Industrial Control Systems

Pavel Čeleda, Jan Vykopal, Valdemar Švábenský et al.

There are different requirements on cybersecurity of industrial control systems and information technology systems. This fact exacerbates the global issue of hiring cybersecurity employees with relevant skills. In this paper, we present KYPO4INDUSTRY training facility and a course syllabus for beginner and intermediate computer science students to learn cybersecurity in a simulated industrial environment. The training facility is built using open-source hardware and software and provides reconfigurable modules of industrial control systems. The course uses a flipped classroom format with hands-on projects: the students create educational games that replicate real cyber attacks. Throughout the semester, they learn to understand the risks and gain capabilities to respond to cyber attacks that target industrial control systems. Our described experience from the design of the testbed and its usage can help any educator interested in teaching cybersecurity of cyber-physical systems.

LGDec 3, 2024
Evaluating the Impact of Data Augmentation on Predictive Model Performance

Valdemar Švábenský, Conrad Borchers, Elizabeth B. Cloude et al.

In supervised machine learning (SML) research, large training datasets are essential for valid results. However, obtaining primary data in learning analytics (LA) is challenging. Data augmentation can address this by expanding and diversifying data, though its use in LA remains underexplored. This paper systematically compares data augmentation techniques and their impact on prediction performance in a typical LA task: prediction of academic outcomes. Augmentation is demonstrated on four SML models, which we successfully replicated from a previous LAK study based on AUC values. Among 21 augmentation techniques, SMOTE-ENN sampling performed the best, improving the average AUC by 0.01 and approximately halving the training time compared to the baseline models. In addition, we compared 99 combinations of chaining 21 techniques, and found minor, although statistically significant, improvements across models when adding noise to SMOTE-ENN (+0.014). Notably, some augmentation techniques significantly lowered predictive performance or increased performance fluctuation related to random chance. This paper's contribution is twofold. Primarily, our empirical findings show that sampling techniques provide the most statistically reliable performance improvements for LA applications of SML, and are computationally more efficient than deep generation methods with complex hyperparameter settings. Second, the LA community may benefit from validating a recent study through independent replication.

LGApr 8, 2025
Single-Agent vs. Multi-Agent LLM Strategies for Automated Student Reflection Assessment

Gen Li, Li Chen, Cheng Tang et al.

We explore the use of Large Language Models (LLMs) for automated assessment of open-text student reflections and prediction of academic performance. Traditional methods for evaluating reflections are time-consuming and may not scale effectively in educational settings. In this work, we employ LLMs to transform student reflections into quantitative scores using two assessment strategies (single-agent and multi-agent) and two prompting techniques (zero-shot and few-shot). Our experiments, conducted on a dataset of 5,278 reflections from 377 students over three academic terms, demonstrate that the single-agent with few-shot strategy achieves the highest match rate with human evaluations. Furthermore, models utilizing LLM-assessed reflection scores outperform baselines in both at-risk student identification and grade prediction tasks. These findings suggest that LLMs can effectively automate reflection assessment, reduce educators' workload, and enable timely support for students who may need additional assistance. Our work emphasizes the potential of integrating advanced generative AI technologies into educational practices to enhance student engagement and academic success.

CYMay 24, 2024
E2Vec: Feature Embedding with Temporal Information for Analyzing Student Actions in E-Book Systems

Yuma Miyazaki, Valdemar Švábenský, Yuta Taniguchi et al.

Digital textbook (e-book) systems record student interactions with textbooks as a sequence of events called EventStream data. In the past, researchers extracted meaningful features from EventStream, and utilized them as inputs for downstream tasks such as grade prediction and modeling of student behavior. Previous research evaluated models that mainly used statistical-based features derived from EventStream logs, such as the number of operation types or access frequencies. While these features are useful for providing certain insights, they lack temporal information that captures fine-grained differences in learning behaviors among different students. This study proposes E2Vec, a novel feature representation method based on word embeddings. The proposed method regards operation logs and their time intervals for each student as a string sequence of characters and generates a student vector of learning activity features that incorporates time information. We applied fastText to generate an embedding vector for each of 305 students in a dataset from two years of computer science courses. Then, we investigated the effectiveness of E2Vec in an at-risk detection task, demonstrating potential for generalizability and performance.

LGMay 14, 2025
Ranking-Based At-Risk Student Prediction Using Federated Learning and Differential Features

Shunsuke Yoneda, Valdemar Švábenský, Gen Li et al.

Digital textbooks are widely used in various educational contexts, such as university courses and online lectures. Such textbooks yield learning log data that have been used in numerous educational data mining (EDM) studies for student behavior analysis and performance prediction. However, these studies have faced challenges in integrating confidential data, such as academic records and learning logs, across schools due to privacy concerns. Consequently, analyses are often conducted with data limited to a single school, which makes developing high-performing and generalizable models difficult. This study proposes a method that combines federated learning and differential features to address these issues. Federated learning enables model training without centralizing data, thereby preserving student privacy. Differential features, which utilize relative values instead of absolute values, enhance model performance and generalizability. To evaluate the proposed method, a model for predicting at-risk students was trained using data from 1,136 students across 12 courses conducted over 4 years, and validated on hold-out test data from 5 other courses. Experimental results demonstrated that the proposed method addresses privacy concerns while achieving performance comparable to that of models trained via centralized learning in terms of Top-n precision, nDCG, and PR-AUC. Furthermore, using differential features improved prediction performance across all evaluation datasets compared to non-differential approaches. The trained models were also applicable for early prediction, achieving high performance in detecting at-risk students in earlier stages of the semester within the validation datasets.

LGMay 16, 2024
Evaluating Algorithmic Bias in Models for Predicting Academic Performance of Filipino Students

Valdemar Švábenský, Mélina Verger, Maria Mercedes T. Rodrigo et al.

Algorithmic bias is a major issue in machine learning models in educational contexts. However, it has not yet been studied thoroughly in Asian learning contexts, and only limited work has considered algorithmic bias based on regional (sub-national) background. As a step towards addressing this gap, this paper examines the population of 5,986 students at a large university in the Philippines, investigating algorithmic bias based on students' regional background. The university used the Canvas learning management system (LMS) in its online courses across a broad range of domains. Over the period of three semesters, we collected 48.7 million log records of the students' activity in Canvas. We used these logs to train binary classification models that predict student grades from the LMS activity. The best-performing model reached AUC of 0.75 and weighted F1-score of 0.79. Subsequently, we examined the data for bias based on students' region. Evaluation using three metrics: AUC, weighted F1-score, and MADD showed consistent results across all demographic groups. Thus, no unfairness was observed against a particular student group in the grade predictions.

LGDec 19, 2024
Knowledge Distillation in RNN-Attention Models for Early Prediction of Student Performance

Sukrit Leelaluk, Cheng Tang, Valdemar Švábenský et al.

Educational data mining (EDM) is a part of applied computing that focuses on automatically analyzing data from learning contexts. Early prediction for identifying at-risk students is a crucial and widely researched topic in EDM research. It enables instructors to support at-risk students to stay on track, preventing student dropout or failure. Previous studies have predicted students' learning performance to identify at-risk students by using machine learning on data collected from e-learning platforms. However, most studies aimed to identify at-risk students utilizing the entire course data after the course finished. This does not correspond to the real-world scenario that at-risk students may drop out before the course ends. To address this problem, we introduce an RNN-Attention-KD (knowledge distillation) framework to predict at-risk students early throughout a course. It leverages the strengths of Recurrent Neural Networks (RNNs) in handling time-sequence data to predict students' performance at each time step and employs an attention mechanism to focus on relevant time steps for improved predictive accuracy. At the same time, KD is applied to compress the time steps to facilitate early prediction. In an empirical evaluation, RNN-Attention-KD outperforms traditional neural network models in terms of recall and F1-measure. For example, it obtained recall and F1-measure of 0.49 and 0.51 for Weeks 1--3 and 0.51 and 0.61 for Weeks 1--6 across all datasets from four years of a university course. Then, an ablation study investigated the contributions of different knowledge transfer methods (distillation objectives). We found that hint loss from the hidden layer of RNN and context vector loss from the attention module on RNN could enhance the model's prediction performance for identifying at-risk students. These results are relevant for EDM researchers employing deep learning models.

CYFeb 19
Open Datasets in Learning Analytics: Trends, Challenges, and Best PRACTICE

Valdemar Švábenský, Brendan Flanagan, Erwin Daniel López Zapata et al.

Open datasets play a crucial role in three research domains that intersect data science and education: learning analytics, educational data mining, and artificial intelligence in education. Researchers in these domains apply computational methods to analyze data from educational contexts, aiming to better understand and improve teaching and learning. Providing open datasets alongside research papers supports reproducibility, collaboration, and trust in research findings. It also provides individual benefits for authors, such as greater visibility, credibility, and citation potential. Despite these advantages, the availability of open datasets and the associated practices within the learning analytics research communities, especially at their flagship conference venues, remain unclear. We surveyed available datasets published alongside research papers in learning analytics. We manually examined 1,125 papers from three flagship conferences (LAK, EDM, and AIED) over the past five years. We discovered, categorized, and analyzed 172 datasets used in 204 publications. Our study presents the most comprehensive collection and analysis of open educational datasets to date, along with the most detailed categorization. Of the 172 datasets identified, 143 were not captured in any prior survey of open data in learning analytics. We provide insights into the datasets' context, analytical methods, use, and other properties. Based on this survey, we summarize the current gaps in the field. Furthermore, we list practical recommendations, advice, and 8-item guidelines under the acronym PRACTICE with a checklist to help researchers publish their data. Lastly, we share our original dataset: an annotated inventory detailing the discovered datasets and the corresponding publications. We hope these findings will support further adoption of open data practices in learning analytics communities and beyond.

CYMay 12, 2025
LECTOR: Summarizing E-book Reading Content for Personalized Student Support

Erwin Daniel López Zapata, Cheng Tang, Valdemar Švábenský et al.

Educational e-book platforms provide valuable information to teachers and researchers through two main sources: reading activity data and reading content data. While reading activity data is commonly used to analyze learning strategies and predict low-performing students, reading content data is often overlooked in these analyses. To address this gap, this study proposes LECTOR (Lecture slides and Topic Relationships), a model that summarizes information from reading content in a format that can be easily integrated with reading activity data. Our first experiment compared LECTOR to representative Natural Language Processing (NLP) models in extracting key information from 2,255 lecture slides, showing an average improvement of 5% in F1-score. These results were further validated through a human evaluation involving 28 students, which showed an average improvement of 21% in F1-score over a model predominantly used in current educational tools. Our second experiment compared reading preferences extracted by LECTOR with traditional reading activity data in predicting low-performing students using 600,712 logs from 218 students. The results showed a tendency to improve the predictive performance by integrating LECTOR. Finally, we proposed examples showing the potential application of the reading preferences extracted by LECTOR in designing personalized interventions for students.

CRJan 5, 2022
Reinforcing Cybersecurity Hands-on Training With Adaptive Learning

Pavel Seda, Jan Vykopal, Valdemar Švábenský et al.

This paper presents how learning experience influences students' capability to learn and their motivation for learning. Although each student is different, standard instruction methods do not adapt to individuals. Adaptive learning reverses this practice and attempts to improve the student experience. While adaptive learning is well-established in programming, it is rarely used in cybersecurity education. This paper is one of the first works investigating adaptive learning in security training. First, we analyze the performance of 95 students in 12 training sessions to understand the limitations of the current training practice. Less than half of the students completed the training without displaying a solution, and only in two sessions, all students completed all phases. Then, we simulate how students would proceed in one of the past training sessions if it would offer more paths of various difficulty. Based on this simulation, we propose a novel tutor model for adaptive training, which considers students' proficiency before and during an ongoing training session. The proficiency is assessed using a pre-training questionnaire and various in-training metrics. Finally, we conduct a study with 24 students and new training using the proposed tutor model and adaptive training format. The results show that the adaptive training does not overwhelm students as the original static training. Adaptive training enables students to enter several alternative training phases with lower difficulty than the original training. The proposed format is not restricted to a particular training. Therefore, it can be applied to practicing any security topic or even in related fields, such as networking or operating systems. Our study indicates that adaptive learning is a promising approach for improving the student experience in security education. We also highlight implications for educational practice.

CRJan 5, 2021
Cybersecurity Knowledge and Skills Taught in Capture the Flag Challenges

Valdemar Švábenský, Pavel Čeleda, Jan Vykopal et al.

Capture the Flag challenges are a popular form of cybersecurity education, where students solve hands-on tasks in an informal, game-like setting. The tasks feature diverse assignments, such as exploiting websites, cracking passwords, and breaching unsecured networks. However, it is unclear how the skills practiced by these challenges match formal cybersecurity curricula defined by security experts. We explain the significance of Capture the Flag challenges in cybersecurity training and analyze their 15,963 textual solutions collected since 2012. Based on keywords in the solutions, we map them to well-established ACM/IEEE curricular guidelines to understand which skills the challenges teach. We study the distribution of cybersecurity topics, their variance in different challenge formats, and their development over the past years. The analysis showed the prominence of technical knowledge about cryptography and network security, but human aspects, such as social engineering and cybersecurity awareness, are neglected. We discuss the implications of these results and relate them to contemporary literature. Our results indicate that future Capture the Flag challenges should include non-technical aspects to address the current advanced cyber threats and attract a broader audience to cybersecurity.

CRApr 24, 2020
Benefits and Pitfalls of Using Capture the Flag Games in University Courses

Jan Vykopal, Valdemar Švábenský, Ee-Chien Chang

The concept of Capture the Flag (CTF) games for practicing cybersecurity skills is widespread in informal educational settings and leisure-time competitions. However, it is not much used in university courses. This paper summarizes our experience from using jeopardy CTF games as homework assignments in an introductory undergraduate course. Our analysis of data describing students' in-game actions and course performance revealed four aspects that should be addressed in the design of CTF tasks: scoring, scaffolding, plagiarism, and learning analytics capabilities of the used CTF platform. The paper addresses these aspects by sharing our recommendations. We believe that these recommendations are useful for cybersecurity instructors who consider using CTF games for assessment in university courses and developers of CTF game frameworks.

HCMar 7, 2020
Conceptual Model of Visual Analytics for Hands-on Cybersecurity Training

Radek Ošlejšek, Vít Rusňák, Karolína Burská et al.

Hands-on training is an effective way to practice theoretical cybersecurity concepts and increase participants' skills. In this paper, we discuss the application of visual analytics principles to the design, execution, and evaluation of training sessions. We propose a conceptual model employing visual analytics that supports the sensemaking activities of users involved in various phases of the training life cycle. The model emerged from our long-term experience in designing and organizing diverse hands-on cybersecurity training sessions. It provides a classification of visualizations and can be used as a framework for developing novel visualization tools supporting phases of the training life-cycle. We demonstrate the model application on examples covering two types of cybersecurity training programs.

HCDec 23, 2019
Visual Feedback for Players of Multi-Level Capture the Flag Games: Field Usability Study

Radek Ošlejšek, Vít Rusňák, Karolína Burská et al.

Capture the Flag games represent a popular method of cybersecurity training. Providing meaningful insight into the training progress is essential for increasing learning impact and supporting participants' motivation, especially in advanced hands-on courses. In this paper, we investigate how to provide valuable post-game feedback to players of serious cybersecurity games through interactive visualizations. In collaboration with domain experts, we formulated user requirements that cover three cognitive perspectives: gameplay overview, person-centric view, and comparative feedback. Based on these requirements, we designed two interactive visualizations that provide complementary views on game results. They combine a known clustering and time-based visual approaches to show game results in a way that is easy to decode for players. The purposefulness of our visual feedback was evaluated in a usability field study with attendees of the Summer School in Cyber Security. The evaluation confirmed the adequacy of the two visualizations for instant post-game feedback. Despite our initial expectations, there was no strong preference for neither of the visualizations in solving different tasks.

CRNov 26, 2019
What Are Cybersecurity Education Papers About? A Systematic Literature Review of SIGCSE and ITiCSE Conferences

Valdemar Švábenský, Jan Vykopal, Pavel Čeleda

Cybersecurity is now more important than ever, and so is education in this field. However, the cybersecurity domain encompasses an extensive set of concepts, which can be taught in different ways and contexts. To understand the state of the art of cybersecurity education and related research, we examine papers from the ACM SIGCSE and ACM ITiCSE conferences. From 2010 to 2019, a total of 1,748 papers were published at these conferences, and 71 of them focus on cybersecurity education. The papers discuss courses, tools, exercises, and teaching approaches. For each paper, we map the covered topics, teaching context, evaluation methods, impact, and the community of authors. We discovered that the technical topic areas are evenly covered (the most prominent being secure programming, network security, and offensive security), and human aspects, such as privacy and social engineering, are present as well. The interventions described in SIGCSE and ITiCSE papers predominantly focus on tertiary education in the USA. The subsequent evaluation mostly consists of collecting students' subjective perceptions via questionnaires. However, less than a third of the papers provide supplementary materials for other educators, and none of the authors published their dataset. Our results provide orientation in the area, a synthesis of trends, and implications for further research. Therefore, they are relevant for instructors, researchers, and anyone new in the field of cybersecurity education. The information we collected and synthesized from individual papers are organized in a publicly available dataset.

CYDec 8, 2017
Challenges Arising from Prerequisite Testing in Cybersecurity Games

Valdemar Švábenský, Jan Vykopal

Cybersecurity games are an attractive and popular method of active learning. However, the majority of current games are created for advanced players, which often leads to frustration in less experienced learners. Therefore, we decided to focus on a diagnostic assessment of participants entering the games. We assume that information about the players' knowledge, skills, and experience enables tutors or learning environments to suitably assist participants with game challenges and maximize learning in their virtual adventure. In this paper, we present a pioneering experiment examining the predictive value of a short quiz and self-assessment for identifying learners' readiness before playing a cybersecurity game. We hypothesized that these predictors would model players' performance. A linear regression analysis showed that the game performance can be accurately predicted by well-designed prerequisite testing, but not by self-assessment. At the same time, we identified major challenges related to the design of pretests for cybersecurity games: calibrating test questions with respect to the skills relevant for the game, minimizing the quiz's length while maximizing its informative value, and embedding the pretest in the game. Our results are relevant for educational researchers and cybersecurity instructors of students at all learning levels.