Assessing Generalisation Capability of Machine Learning Models for Intrusion Detection
For cybersecurity practitioners, this work highlights the poor cross-dataset generalisation of current ML-based intrusion detection systems and the resulting need for adaptive models.
The study evaluates supervised ML models for intrusion detection across datasets, finding that Random Forest achieves high same-dataset accuracy (95.08% on UNSW-NB15, 99.79% on TON_IoT) but drops below 40% in cross-dataset tests, revealing a significant generalisation gap.
The growth of networked and IoT systems has intensified cyber-security threats and exposed the limits of traditional signature-based intrusion detection. Although machine-learning-based intrusion detection systems often report strong benchmark performance, high accuracy within a single dataset does not guarantee reliable performance in unseen network environments. This study investigates the generalisation capability of supervised machine learning models for intrusion detection using UNSW-NB15 and TON_IoT. Random Forest, Logistic Regression, and Naive Bayes were evaluated under same-dataset and cross-dataset settings. Random Forest achieved the strongest same-dataset performance, with 95.08% accuracy on UNSW-NB15 and 99.79% on TON_IoT, but performance dropped sharply in cross-dataset testing: when models were trained on UNSW-NB15 and tested on TON_IoT, or vice versa, accuracy fell below 40%. These results reveal a significant generalisation gap in intrusion detection. We connect this challenge to affective computing and human-centric AI, where behavioural signal analysis, anomaly detection, domain shift, and context-sensitive modelling are also central. This framing highlights the need for adaptive, generalisable cyber-security models that can operate across changing network and IoT environments.
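The same-dataset versus cross-dataset protocol described above can be illustrated with a minimal sketch. It is not the study's pipeline: the two "domains" are synthetic stand-ins for UNSW-NB15 and TON_IoT, the feature offsets are arbitrary assumptions chosen to create a domain shift, and the hand-written Gaussian Naive Bayes and all helper names (`make_domain`, `fit_gaussian_nb`, `predict`, `accuracy`) are hypothetical and exist only for illustration.

```python
import math
import random
from statistics import fmean, pvariance


def make_domain(rng, n, offset):
    """Synthetic two-feature flows (assumed data): label 0 = benign, 1 = attack."""
    rows = []
    for i in range(n):
        label = i % 2                      # balanced classes
        centre = offset + 3.0 * label      # attacks sit 3 units above benign traffic
        rows.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    rng.shuffle(rows)
    return rows


def fit_gaussian_nb(rows):
    """Estimate per-class log prior, feature means, and feature variances."""
    model = {}
    for label in {y for _, y in rows}:
        feats = [x for x, y in rows if y == label]
        cols = list(zip(*feats))
        model[label] = (
            math.log(len(feats) / len(rows)),
            [fmean(c) for c in cols],
            [pvariance(c) + 1e-9 for c in cols],   # small floor avoids division by zero
        )
    return model


def predict(model, x):
    """Return the class with the highest log posterior under the NB assumption."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def accuracy(model, rows):
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


rng = random.Random(0)
domain_a = make_domain(rng, 2000, offset=0.0)   # stand-in for the training dataset
domain_b = make_domain(rng, 2000, offset=4.0)   # stand-in for the unseen dataset (shifted features)

# Same-dataset setting: train and test on disjoint splits of domain A.
train_a, test_a = domain_a[:1500], domain_a[1500:]
model = fit_gaussian_nb(train_a)
same_acc = accuracy(model, test_a)

# Cross-dataset setting: the model trained on A is scored on domain B unchanged.
cross_acc = accuracy(model, domain_b)
gap = same_acc - cross_acc

print(f"same-dataset accuracy:  {same_acc:.3f}")
print(f"cross-dataset accuracy: {cross_acc:.3f}")
print(f"generalisation gap:     {gap:.3f}")
```

The point of the sketch is the evaluation structure rather than the numbers: the classifier is fitted once, then scored on a held-out split of its own domain and on a second domain whose feature distribution differs. Real datasets such as UNSW-NB15 and TON_IoT would additionally require aligning their feature sets before a cross-dataset test is possible, a step this toy example sidesteps by construction.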