CR, AI, LG (Jul 15, 2024)

Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems

arXiv:2407.11105v1, 2 citations, h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses a gap in intrusion detection research, but its contribution is incremental: it evaluates existing techniques rather than introducing new methods.

The study investigated how data preprocessing and hyperparameter optimization affect machine learning models for intrusion detection systems, finding that these techniques generally make classification models more robust and significantly reduce training and testing execution times.

In the cybersecurity of modern communication networks, Intrusion Detection Systems (IDS) have been continuously improved, and many now incorporate machine learning (ML) techniques to identify threats. Although prior research has studied these techniques as applied to IDS, the state of the art lacks work devoted exclusively to evaluating how data preprocessing and the optimization of ML algorithm hyperparameter values affect the construction of threat identification models. This article presents a study that fills this research gap. To that end, experiments were carried out on two data sets, comparing attack scenarios under different preprocessing techniques and hyperparameter optimization settings. The results confirm that, in general, the proper application of these techniques makes the resulting classification models more robust and greatly reduces the execution times of the models' training and testing processes.
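The kind of comparison the abstract describes can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the abstract does not name the data sets, algorithms, or preprocessing steps, so everything here is an assumption chosen for illustration. The records are synthetic, the preprocessing step is min-max scaling, the classifier is a plain k-nearest-neighbours vote, and the hyperparameter search is a grid over `k`. All function names (`make_records`, `run_scenarios`, and so on) are hypothetical.

```python
import random
import time


def make_records(n, rng):
    """Synthetic two-class records (NOT a real IDS data set): ((duration, bytes), label)."""
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)                       # 0 = benign, 1 = attack (hypothetical)
        duration = rng.gauss(1.0 + 1.5 * label, 0.5)    # informative feature, small scale
        n_bytes = rng.gauss(50_000.0, 20_000.0)         # uninformative feature, large scale
        records.append(((duration, n_bytes), label))
    return records


def fit_min_max(records):
    """Learn per-feature (min, max) bounds from the training split only."""
    columns = zip(*(features for features, _ in records))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(records, bounds):
    """Rescale features with the training bounds (unseen values may leave [0, 1])."""
    scaled = []
    for features, label in records:
        row = tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                    for v, (lo, hi) in zip(features, bounds))
        scaled.append((row, label))
    return scaled


def knn_predict(train, features, k):
    """Majority vote among the k training records closest in squared Euclidean distance."""
    def sq_dist(record):
        return sum((a - b) ** 2 for a, b in zip(record[0], features))
    nearest = sorted(train, key=sq_dist)[:k]
    positives = sum(label for _, label in nearest)
    return 1 if 2 * positives > k else 0


def accuracy(train, evaluation, k):
    hits = sum(knn_predict(train, f, k) == y for f, y in evaluation)
    return hits / len(evaluation)


def grid_search_k(train, valid, candidates):
    """Pick the candidate k with the best accuracy on the validation split."""
    return max(candidates, key=lambda k: accuracy(train, valid, k))


def run_scenarios(seed=0, candidates=(1, 3, 5, 9, 15)):
    """Evaluate every combination of {raw, scaled} data and {default, tuned} k."""
    rng = random.Random(seed)
    train, valid, test = (make_records(n, rng) for n in (300, 100, 100))
    bounds = fit_min_max(train)
    splits = {
        "raw": (train, valid, test),
        "scaled": tuple(apply_min_max(s, bounds) for s in (train, valid, test)),
    }
    results = {}
    for prep, (tr, va, te) in splits.items():
        for tuning in ("default", "tuned"):
            k = 1 if tuning == "default" else grid_search_k(tr, va, candidates)
            start = time.perf_counter()
            acc = accuracy(tr, te, k)
            elapsed = time.perf_counter() - start
            results[f"{prep}+{tuning}"] = {"k": k, "accuracy": acc, "test_seconds": elapsed}
    return results


if __name__ == "__main__":
    for name, row in run_scenarios().items():
        print(name, row)
```

The scaling bounds are fitted on the training split only and `k` is selected on a separate validation split, so the test split stays untouched until the final measurement. The timing field is included only because the abstract reports execution times; this toy setup is not expected to reproduce the paper's timing results.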

