LG · AI · Sep 27, 2024

CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting

arXiv:2409.18874v1 · 28 citations · h-index: 13
Originality: Synthesis-oriented
AI Analysis

This dataset fills a gap for network security researchers by offering a real-world benchmark that helps avoid overestimating the performance of anomaly detection algorithms. The contribution is incremental: it is a data collection effort rather than a new method.

The authors address the lack of real-world network datasets for anomaly detection by introducing CESNET-TimeSeries24, a dataset built from 40 weeks of traffic from 275,000 active IP addresses on the CESNET3 network, which provides authentic variability for model evaluation.

Anomaly detection in network traffic is crucial for maintaining the security of computer networks and identifying malicious activities. Forecasting-based methods are one of the primary approaches to anomaly detection. Nevertheless, extensive real-world network datasets for forecasting and anomaly detection techniques are lacking, which may lead to overestimated performance of anomaly detection algorithms. This manuscript addresses this gap by introducing a dataset comprising time series data of network entities' behavior, collected from the CESNET3 network. The dataset was created from 40 weeks of network traffic of 275 thousand active IP addresses. The ISP origin of the presented data ensures a high level of variability among network entities, which forms a unique and authentic challenge for forecasting and anomaly detection models. It provides valuable insights into the practical deployment of forecast-based anomaly detection approaches.
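
To make the idea of forecast-based anomaly detection concrete, here is a minimal sketch in Python. It is not the authors' method and does not use the dataset's actual format: the function names (`seasonal_median_forecast`, `detect_anomalies`), the seasonal-median forecaster, the MAD-based threshold, and the synthetic hourly series with a period of 24 samples are all assumptions chosen for illustration. The principle is the one the abstract describes: forecast each point from earlier data, then flag points whose forecast error is unusually large.

```python
from statistics import median


def seasonal_median_forecast(history, season):
    """Forecast the next value as the median of earlier values at the same seasonal phase."""
    t = len(history)
    # Indices t % season, t % season + season, ... all lie before t.
    same_phase = history[t % season::season]
    return median(same_phase)


def detect_anomalies(series, season, k=5.0, min_scale=1e-9):
    """Return indices whose forecast residual deviates by more than k robust scale units."""
    if len(series) <= season:
        return []  # not enough history to forecast anything

    # Residual = observed value minus a forecast built from strictly earlier points.
    residuals = [
        series[t] - seasonal_median_forecast(series[:t], season)
        for t in range(season, len(series))
    ]

    # Robust scale: median absolute deviation, scaled to match a Gaussian std.
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    threshold = k * max(1.4826 * mad, min_scale)

    return [
        season + i
        for i, r in enumerate(residuals)
        if abs(r - center) > threshold
    ]


# Hypothetical data: one week of a perfectly periodic "hourly" signal with one spike.
pattern = [10 + (hour % 12) for hour in range(24)]
series = pattern * 7
series[100] += 50
print(detect_anomalies(series, season=24))  # [100]
```

Using the median over past seasons rather than the single previous season keeps one spike from contaminating the next day's forecast. On real traffic, where variability between network entities is high, the forecaster and the threshold `k` would both need tuning per entity, which is the kind of evaluation a real-world dataset makes possible.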

Code Implementations: 1 repo
