TII-SSRC-23 Dataset: Typological Exploration of Diverse Traffic Patterns for Intrusion Detection
This provides a more comprehensive dataset for researchers in network security to improve intrusion detection models, though it is incremental as it focuses on data rather than new methods.
The paper introduces TII-SSRC-23, a novel dataset designed to address the lack of diversity in existing network intrusion detection datasets, and establishes baselines for supervised and unsupervised methods using it.
The effectiveness of network intrusion detection systems, predominantly based on machine learning, are highly influenced by the dataset they are trained on. Ensuring an accurate reflection of the multifaceted nature of benign and malicious traffic in these datasets is essential for creating models capable of recognizing and responding to a wide array of intrusion patterns. However, existing datasets often fall short, lacking the necessary diversity and alignment with the contemporary network environment, thereby limiting the effectiveness of intrusion detection. This paper introduces TII-SSRC-23, a novel and comprehensive dataset designed to overcome these challenges. Comprising a diverse range of traffic types and subtypes, our dataset is a robust and versatile tool for the research community. Additionally, we conduct a feature importance analysis, providing vital insights into critical features for intrusion detection tasks. Through extensive experimentation, we also establish firm baselines for supervised and unsupervised intrusion detection methodologies using our dataset, further contributing to the advancement and adaptability of intrusion detection models in the rapidly changing landscape of network security. Our dataset is available at https://kaggle.com/datasets/daniaherzalla/tii-ssrc-23.