LG · CR · Apr 26, 2025

Performance of Machine Learning Classifiers for Anomaly Detection in Cyber Security Applications

arXiv:2504.18771v1 · 1 citation · h-index: 9 · Has Code
Originality: Synthesis-oriented
AI Analysis

This is an incremental study comparing existing methods for anomaly detection in cybersecurity, relevant for practitioners handling imbalanced data.

This work empirically evaluates machine learning classifiers for anomaly detection in cybersecurity and finds that XGB and MLP outperform generative models such as GAN and VAE on imbalanced datasets.

The study evaluates machine learning models on two imbalanced public datasets, KDDCUP99 and Credit Card Fraud 2013. The pipeline consists of data preparation, model training, and evaluation, with an 80/20 train/test split. The models tested are eXtreme Gradient Boosting (XGB), Multilayer Perceptron (MLP), Generative Adversarial Network (GAN), Variational Autoencoder (VAE), and Multiple-Objective Generative Adversarial Active Learning (MO-GAAL); XGB and MLP are additionally combined with Random-Over-Sampling (ROS) and Self-Paced-Ensemble (SPE). Evaluation uses 5-fold cross-validation, and three imputation techniques (mean, median, and IterativeImputer) are compared with 10%, 20%, 30%, and 50% of the data missing. XGB and MLP outperform the generative models. IterativeImputer yields results comparable to mean and median imputation but is not recommended for large datasets because of its higher complexity and execution time. The code is publicly available on GitHub (github.com/markushaug/acr-25).
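The missing-data part of the evaluation can be illustrated with a minimal sketch. It is not taken from the paper's repository: the helper names `mask_missing` and `impute` are hypothetical, the data is a synthetic stand-in for the real datasets, and it assumes values are removed completely at random, which the summary does not state. Only mean and median imputation are shown; the model-based IterativeImputer named in the summary is not reproduced here.

```python
import random
import statistics


def mask_missing(rows, rate, rng):
    """Return a copy of rows with a fraction `rate` of all cells set to None.

    Assumption: cells are removed completely at random (MCAR).
    """
    n_rows, n_cols = len(rows), len(rows[0])
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    n_missing = round(rate * len(cells))
    masked = [list(r) for r in rows]
    for i, j in rng.sample(cells, n_missing):
        masked[i][j] = None
    return masked


def impute(rows, strategy="mean"):
    """Fill None cells with the per-column mean or median of the observed values."""
    stat = statistics.mean if strategy == "mean" else statistics.median
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column happens to be entirely missing.
        fills.append(stat(observed) if observed else 0.0)
    return [[fills[j] if v is None else v for j, v in enumerate(r)] for r in rows]


rng = random.Random(0)
# Synthetic stand-in data: 200 rows, 4 numeric features.
data = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]

for rate in (0.10, 0.20, 0.30, 0.50):  # missing-data rates named in the summary
    masked = mask_missing(data, rate, rng)
    n_missing = sum(v is None for r in masked for v in r)
    for strategy in ("mean", "median"):
        filled = impute(masked, strategy)
        assert all(v is not None for r in filled for v in r)
        print(f"{rate:.0%} missing, {strategy}: {n_missing} cells imputed")
```

In a cross-validated setup, the fill values would normally be computed on the training folds only and then applied to the held-out fold, so that no information from the test data leaks into the imputation.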
