CR AI LG
Jul 15, 2024

Impacts of Data Preprocessing and Hyperparameter Optimization on the Performance of Machine Learning Models Applied to Intrusion Detection Systems

Mateus Guimarães Lima, Antony Carvalho, João Gabriel Álvares, Clayton Escouper das Chagas, Ronaldo Ribeiro Goldschmidt

arXiv:2407.11105v1
2 citations, h-index: 2

Originality: Synthesis-oriented

AI Analysis

This work addresses a gap in cybersecurity for improving intrusion detection systems, but it is incremental as it focuses on evaluating existing techniques rather than introducing new methods.

The study investigated how data preprocessing and hyperparameter optimization affect machine learning models for intrusion detection systems, finding that these techniques generally make classification models more robust and significantly reduce training and testing execution times.

In the cybersecurity of modern communication networks, Intrusion Detection Systems (IDS) have been continuously improved, and many now incorporate machine learning (ML) techniques to identify threats. Although prior research has studied these techniques applied to IDS, the state of the art lacks work focused exclusively on evaluating how data preprocessing and the optimization of ML algorithm hyperparameters affect the construction of threat identification models. This article presents a study that fills this research gap. To that end, experiments were carried out on two data sets, comparing attack scenarios under different preprocessing techniques and hyperparameter optimization settings. The results confirm that the proper application of these techniques generally makes the resulting classification models more robust and greatly reduces the execution times of their training and testing processes.
