CR AI LG | Oct 25, 2021

Bridging the gap to real-world for network intrusion detection systems with data-centric approach

Gustavo de Carvalho Bertoli, Lourenço Alves Pereira Junior, Filipe Alves Neto Verri, Aldri Luiz dos Santos, Osamu Saotome

arXiv:2110.13655v2 | 3.8 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the limitation that aging datasets impose on cybersecurity researchers and practitioners, though it appears incremental because it focuses on dataset generation rather than a new detection method.

The paper tackles the problem of outdated datasets in network intrusion detection systems (NIDS) research by proposing a data-centric approach to generate recent, labeled network traffic and attack datasets, aiming to bridge the gap to real-world applications.

Most research using machine learning (ML) for network intrusion detection systems (NIDS) relies on well-established datasets such as KDD-CUP99, NSL-KDD, UNSW-NB15, and CICIDS-2017. In this context, researchers explore machine learning techniques with the aim of improving metrics over published baselines (a model-centric approach). However, those datasets have limitations, such as aging, that make it infeasible to transfer those ML-based solutions to real-world applications. This paper presents a systematic data-centric approach to address the current limitations of NIDS research, specifically the datasets. This approach generates NIDS datasets composed of the most recent network traffic and attacks, with the labeling process integrated by design.
